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Chapter  1 


Introduction 


A  recent  development  in  the  continuing  search  for  faster  methods  of  circuit  simula¬ 
tion  has  been  the  advent  of  special-purpose  hardware  dedicated  to  simulation.  With 
a  large  number  of  small  elements  working  in  parallel,  increases  in  speed  of  two  to 
four  orders  of  magnitude  over  general-purpose  computers  are  not  uncommon.  While 
this  equipment  is  well-suited  to  simulation  at  the  gate  level,  it  has  not  been  very 
successful  at  simulating  MOS  networks  due  to  the  bidirectionality  of  MO  S  transis¬ 
tors.  A  variety  of  special-purpose  hardware  elements  have  been  designed  to  capture 
this  characteristic  of  MOS  transistors  with  limited  success.  A  more  promising  ap¬ 
proach  has  been  demonstrated  by  the  COSMOS  (Compiled  Simulator  for  MOS) 
simulator,  which  symbolically  analyzes  the  behavior  of  a  MOS  circuit  and  then 
represents  it  in  purely  Boolean  terms.  The  expressions  generated  by  COSMOS 
can  then  be  evaluated  on  hardware  designed  for  gate-level  simulation. 

The  Boolean  expressions  generated  are  multi-level  with  shared  subexpressions. 
While  simplification  techniques  for  less  complex  Boolean  expressions  exist,  none 
are  applicable  to  expressions  of  this  type.  When  mapping  these  expressions  onto 
special-purpose  hardware,  a  new  definition  of  simplicity  must  be  made.  Just  as  the 
limiting  factor  for  the  performace  of  logic  is  its  critical  path,  the  limiting  factor 
for  the  performance  of  a  MOS  simulator  using  Boolean  expressions  on  highly  par¬ 
allel  hardware  is  the  depth  of  the  expressions.  However,  optimization  of  Boolean 
expressions  to  minimize  depth  is  an  issue  that  has  not  been  widely  discussed. 
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Gaussian  elimination  is  a  technique  that  has  continually  proved  to  be  useful  for 
problem  solving  in  areas  other  than  systems  of  linear  equations.  In  the  COSMOS 
software,  Gaussian  elimination  is  used  in  the  analysis  of  subnetworks  to  generate 
Boolean  expressions.  In  this  process,  pivot  selection  is  a  crucial  issue  and  many 
methods  have  been  proposed  to  minimize  the  total  number  of  algebraic  operations 
performed.  The  order  in  which  pivots  are  selected  also  affects  the  structure  of 
operations,  an  effect  which  has  not  been  studied  in  depth. 

Presented  in  this  report  is  a  pivot  selection  algorithm  that  guarantees  that  the 
expressions  generated  by  Gaussian  elimination  will  exhibit  balance  and  will  there¬ 
fore  have  small  depth.  The  remainder  of  this  chapter  is  dedicated  to  explaining 
the  motivation  of  this  research  in  relation  to  the  COSMOS  project.  Chapter  2 
discusses  Gaussian  elimination  and  introduces  a  graph  theoretic  view  of  the  algo¬ 
rithm.  Pivot  selection  is  shown  to  be  important  for  keeping  the  total  complexity  of 
the  solution  of  a  sparse  system  low. 

Graph  separator  theorems  and  nested  dissection  pivot  selection  algorithms  are 
the  subject  of  Chapter  3,  where  it  is  shown  that  an  ordering  based  on  nested  dissec¬ 
tion  results  in  expressions  of  small  depth.  Chapter  4  analyzes  series-parallel  graphs 
and  presents  decomposition  trees,  which  are  shown  to  be  useful  for  finding  sepa¬ 
rators  in  such  graphs.  With  these  separators,  it  is  easy  to  produce  an  elimination 
ordering  for  a  series-parallel  graph  that  results  in  small  depth. 

In  Chapter  5,  an  algorithm  is  described  that  allows  the  construction  of  decom¬ 
position  trees  for  arbitrary  graphs.  These  trees  are  constructed  through  a  procedure 
very  similar  to  Gaussian  elimination.  A  heuristic  strategy  is  used  for  this  first  pass, 
but  it  is  shown  that  an  elimination  ordering  that  results  in  low  total  complexity  can 
be  used  to  find  an  elimination  ordering  that  results  in  small  depth  of  the  expres¬ 
sions  generated.  The  method  has  applicability  to  any  system  solved  by  Gaussian 
elimination  but  is  especially  suited  to  those  systems  that  arise  in  the  analysis  of 
MOS  networks.  Finally,  Chapter  6  briefly  discusses  the  outcome  of  this  work  and 
suggests  areas  for  future  research.  Also,  Appendix  A  is  provided  to  explain  some 
basic  graph  theoretical  terms  used  in  this  report. 
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Figixre  1:  COSMOS  Implementation 

1.1  COSMOS 

COSMOS  consists  of  a  set  of  programs  configured  as  shown  in  Figure  1.  The  sym¬ 
bolic  analyzer  AN  AMOS  accepts  as  input  the  switch-level  representation  of  a  MO  S 
circuit  and  partitions  it  into  a  set  of  channel-connected  subnetworks.  It  then  derives 
a  Boolean  representation  of  the  behavior  of  each  subnetwork.  A  second  program, 
LGCC,  translates  this  Boolean  representation  into  a  set  of  C  language  evaluation 
procedures  plus  declarations  of  data  structures  describing  the  interconnections  of 
the  networks.  Finally,  the  code  produced  by  LGCC,  together  with  the  simulation 
kernel  and  \iser  interface  code,  are  compiled  to  generate  the  simulation  program. 
The  resulting  programs  appears  to  the  user  much  like  an  ordinary  simulator,  ex¬ 
cept  that  the  network  is  already  loaded  at  the  start  of  execution.  The  simulator 
implements  a  block-level,  event-driven  scheduler,  in  which  blocks  correspond  to 
subnetworks.  Processing  an  event  involves  calling  the  appropriate  procedure  to 
recompute  the  outputs  of  a  block. 

Unlike  programs  that  operate  directly  on  the  transistor-level  description  during 
simulation,  COSMOS  preprocesses  the  transistor  network  to  produce  a  Boolean 
description.  This  description,  formulated  by  AN  AMOS,  captures  all  aspects  of 
switch-level  networks,  including  bidirectional  transistors,  stored  charge,  different 
signal  strengths,  and  indeterminate  (X)  logic  values. 
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The  most  novel  aspects  of  COSMOS  aj-e  found  in  ANAMOS.  The  trauisistor 
network  b  partitioned  into  channel-connected  subnetworks  and  the  steady-state  re¬ 
sponse  of  each  subnetwork  b  derived  separately.  Each  subnetwork  corresponds  to 
a  component  of  the  undirected  graph  having  as  vertices  the  storage  nodes  and  as 
edges  the  pairs  of  nodes  connected  by  transbtor  sources  and  drains.  This  parti¬ 
tioning  describes  the  static  connections  in  the  network;  i.e.  those  independent  of 
transbtor  state. 

Typical  MOS  circuits  partition  into  many  small  subnetworks,  aJthough  some 
subnetworks  can  be  quite  large.  Within  a  subnetwork,  the  behavior  can  be  complex 
and  difficult  to  analyze  due  to  the  bidirectionality  of  transbtors  and  the  variety  of 
ways  in  which  state  b  formed  in  a  MOS  circuit.  The  interactions  between  subnet¬ 
works,  however,  are  more  straightforward.  Each  subnetwork  acts  as  a  sequential 
logic  element  having  as  input  the  input  nodes  connected  to  transbtor  sources  and 
drains,  plus  the  gate  nodes  of  the  transbtors.  The  subnetwork  state  b  stored  as 
charge  on  the  storage  nodes.  Its  outputs  are  those  nodes  that  are  gate  nodes  of 
transbtors  in  other  subnetworks. 

To  cast  the  switch-level  model  in  terms  of  Boolean  operations,  a  logic  value 
y  €  {0, 1, X}  b  represented  by  a  “dual  rail”  Boolean  encoding,  y.l,y.O  €  {0,  l},  as 
shown  below: 


y 

y.l 

y.O 

1 

1 

0 

0 

0 

1 

X 

1 

1 

With  a  Boolean  encoding  of  the  state  values,  the  problem  of  symbolic  analysb  can 
be  defined  as  follows.  For  each  node  n,  introduce  Boolean  vairiables  n.l  and  n.O  to 
represent  the  possible  encoded  values  of  the  node  state.  For  each  node  n,  ANAMOS 
derives  Boolean  formulas  N.l  and  N.O  in  terms  of  the  set  of  node  state  variables. 
These  formulas  represent  the  encoded  value  of  the  steady-state  response  at  each 
node  as  a  function  of  the  initial  node  states. 
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Switch-level  networks  resemble  classical  contact  networks  in  that  both  are  com¬ 
posed  of  bidirectional  switching  elements.  Shannon  [28]  first  developed  techniques 
for  analyzing  a  contact  network  symbolically.  In  his  method,  each  contact  is  labeled 
with  a  Boolean  literal  and  the  conditions  under  which  a  path  may  form  between 
designated  pairs  of  terminals  are  formulated  as  a  Boolean  expression.  This  idea 
serves  as  the  conceptual  basis  of  ANAMOS,  although  MOS  circuits  require  a  more 
complex  method  of  analysis.  Furthermore,  most  of  the  methods  presented  in  the 
contact  network  literature  are  not  particularly  well  suited  to  computerized  applica¬ 
tions  for  large  circuits. 


1.2  Gaussian  Elimination 

The  contact  network  analysis  problem  can  be  formulated  as  the  solution  of  a  system 
of  Boolean  equations  [7|.  Solution  of  this  system  relates  closely  to  the  problem  of 
finding  expressions  that  describe  all  paths  between  vertices  in  a  directed  graph. 
Tarjsji  [29]  has  shown  that  a  generalized  form  of  Gaussian  elimination  can  solve  a 
large  class  of  path  problems,  mcluding  contact  network  analysis.  When  solving  a 
system  of  Boolean  equations  by  Gaussian  elimination,  Boolean  operations  A  and  v 
replace  the  conventional  arithmetic  operations. 

Gaussian  elimination  proceeds  in  two  parts:  forward  elimination  and  backsolv- 
ing.  Forward  elimination  propagates  information  about  paths  forward  to  a  single 
vertex.  Backsolving  distributes  information  to  the  vertices  previously  eliminated. 
Forward  elimination  consists  of  repeatedly  selecting  a  vertex  v  (the  *pivot” )  for  re¬ 
moval  from  the  graph.  The  elimination  of  vertex  v  involves  propagating  its  label  to 
each  neighbor  a  through  their  shared  edge.  Furthermore,  the  label  of  an  edge  span¬ 
ning  each  pair  of  neighbors  a  and  a;  may  be  updated  as  well.  Backsolving  proceeds 
by  adding  vertices  back  to  the  graph  in  the  reverse  order  of  their  elimination.  The 
value  on  each  vertex  is  computed  by  summing  the  effects  of  all  neighbors  eliminated 
after  v. 


CHAPTER  1.  INTRODUCTION 


6 


Gaussian  elimination  has  a  distinct  advantage  over  iterative  methods  for  sym¬ 
bolic  analysis.  Being  a  direct  method,  it  requires  no  testing  for  convergence.  Sym¬ 
bolic  analysis  can  proceed  by  simply  constructing  Boolean  formulas  in  terms  of 
operations  A  and  V  in  accordance  with  the  elimination  steps.  A  direct  method 
avoids  the  need  to  test  formulas  for  equivalence,  an  NP-hard  problem. 

1.3  Expression  Representation  and  Pivot  Selec¬ 
tion 

ANAMOS  represents  a  Boolean  formula  as  a  directed  acyclic  graph  (DAG).  A  DAG 
resembles  a  parse  tree  whose  leaves  represent  variables  or  constants  and  whose 
internal  nodes  represent  binary  Boolean  operations.  In  a  DAG,  however,  a  given 
subgraph  may  be  shared  by  several  branches,  yielding  a  more  compact  representa¬ 
tion  |4].  During  the  analysis  of  a  subnetwork,  ANAMOS  constructs  a  single  DAG  with 
multiple  roots,  one  root  for  each  vertex  and  edge  in  the  channel  graph  representing 
the  subnetwork. 

Upon  completion  of  the  execution  of  the  Gaussian  elimination  algorithm,  the 
steady-state  response  of  the  subnetwork  has  been  computed.  The  DAG  contains  one 
root  for  each  vertex  of  the  channel  graph,  and  the  expression  indicated  by  that  root 
gives  the  steady  state  response  at  that  vertex  in  the  graph. 

The  total  number  of  nodes  in  the  DAG,  as  well  as  its  structure,  is  determined  by 
^  the  order  of  operation  on  vertices  in  the  Gaussian  elimination  procedure.  Figure  2 
shows  a  simple  graph  that  could  represent  a  pass  transistor  network  of  six  tran¬ 
sistors,  or  a  network  consisting  of  six  switches.  The  vertices  and  edges  are  labeled 
with  symbolic  values.  In  the  Gaussian  elimination  procedure,  the  vertices  in  the 
graph  could  be  operated  on  in  a  number  of  different  orders. 

If  the  vertices  are  eliminated  from  left  to  right,  then  the  DAG  of  Figure  3,  with 
depth  of  24,  results.  Alternatively,  if  the  vertices  are  eliminated  in  an  order  that 
resembles  that  of  a  nested  dissection  algorithm,  the  DAG  of  Figure  4  results.  This 
DAG  has  a  depth  of  only  11  operations,  less  than  half  of  the  other  one.  Each 
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Figure  2:  A  Simple  Graph  of  Seven  Vertices 

DAG  indicates  the  same  total  number  of  operations,  but  their  structure  is  different. 
The  DAG  of  Figure  3  represents  the  solution  of  a  system  in  which  the  operators 
must  be  applied  to  values  sequentially,  whereas  the  DAG  of  Figure  4  indicates  a 
situation  in  which  some  parallelism  could  be  exploited,  given  a  machine  capable 
of  such  an  evaluation  strategy.  For  longer  chains  of  vertices,  the  results  are  more 
dramatic.  The  left-to-right  elimination  ordering  will  give  expressions  whose  depth 
grows  as  0(n),  while  a  nested  dissection  algorithm  will  give  expressions  that  grow 
as  O(lgn). 

1.4  Parallel  Architectures  /  Simulation  Engines 

The  COSMOS  simulator  as  designed  runs  on  a  general-purpose  computer.  Dur¬ 
ing  simulation,  the  DAGs  representing  a  subnetwork  are  evaluated  for  new  values 
at  each  simulation  clock  cycle.  An  event-driven  scheduling  algorithm  is  used  to 
determine  those  subnetworks  that  need  to  be  evaluated,  since  some  subnetworks 
will  not  require  evaluation  on  every  clock  cycle.  The  total  amount  of  time  required 
to  simulate  a  single  clock  cycle  is  a  function  of  the  total  number  of  operations  in 
each  of  the  DAGs,  and  the  niunber  of  subnetworks  needing  to  be  evaduated. 

Special-purpose  processors  have  been  developed  to  accelerate  tasks  of  logic  gate 
simulation  [9,1,6].  Although  special-purpose  processors  for  switch-level  simulation 
have  been  designed  and  constructed  [8,12],  they  require  a  substantial  amount  of  spe¬ 
cialised  hardware.  It  is  unlikely  that  they  will  ever  achieve  the  cost/performance 
levels  of  processors  that  support  only  gate-level  evaluation.  These  gate-level  sim¬ 
ulation  processors  use  many  simple  function  evaluators  in  parallel  to  achieve  their 
speed. 
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The  DAGs  produced  by  ANAMOS  resemble  simple  logic  networks  with  AND  amd 
OR  operations.  These  expressions  can  be  evaluated  on  hardware  designed  for  gate- 
level  simulation.  If  enough  processors  are  used,  the  DAGS  for  all  subnetworks  can 
be  evaluated  in  parallel  and  an  event-driven  scheduler  would  not  be  needed.  In  this 
scenario,  the  limiting  factor  on  the  amount  of  time  it  takes  to  simulate  a  clock  cycle 
is  the  depth  of  the  deepest  DAG  for  all  of  the  subnetworks. 

It  has  already  been  shown  that  the  structure  of  DAGs  is  influenced  by  the  elim¬ 
ination  ordering  used  during  Gaussian  elimination.  The  rest  of  this  report  is  con¬ 
cerned  with  finding  elimination  orderings  guaranteed  to  lead  to  DAGs  with  small 
depth.  While  the  discussion  is  presented  largely  in  terms  of  Boolean  expressions, 
the  results  are  applicable  to  other  systems  solvable  by  Gaussian  elimination. 


Chapter  2 

Gaussian  Elimination 


2.1  Introduction 

Although  Gaussian  elimination  was  originally  devised  as  a  technique  for  solving 
systems  of  linear  equations,  its  use  has  continually  been  shown  to  be  applicable 
to  problem  solving  in  other  areas.  Examples  are  in  the  solution  of  path  problems, 
the  conversion  of  finite  automata  to  regular  expressions,  and  the  analysis  of  global 
fiow  problems  [29,2].  This  chapter  will  discuss  the  importance  of  pivot  selection  in 
Gaussian  elimination  and  will  introduce  a  graph  theoretic  approach  to  the  problem. 
Systems  of  equations  are  represented  by  the  equation 

Afi  =  6, 

where  M  is  an  n  x  n  matrix,  x  is  an  n  x  1  vector  of  unknowns,  and  6  is  an  n  x  1 
vector  of  constants.  If  M  is  a  symmetric  positive  definite  matrix,  then  it  may  be 
factored  into  the  form  M  =  LDL"^  where  L  a  »,  lower  triangular  matrix  and  D  is 
a  diagonal  matrix.  Gatissian  elimination  may  be  used  to  find  L  directly  and  the 
equations  Ly  6,  DL'^x  =  y  may  be  solved  by  backsubstitution  to  find  a  solution 
to  the  system.  The  total  time  required  to  perform  the  process  is  O(n’)  in  general. 
However,  if  Af  is  sparse^  special  provisions  may  be  made  to  perform  elimination 

‘Spanity  will  not  b«  formally  dafinad  kara.  It  will  ba  conaidarad  to  maaa  a  matrix,  many  of 
whoaa  antriaa  ara  laro. 
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Figure  5:  Solution  of  a  System  of  Equations  Under  Two  Permutations 
more  efficiently. 

The  limiting  factor  in  the  efficiency  of  sparse  Gaussian  elimination  is  the  amount 
of  fill-in  that  occurs  during  the  elimination  process.  Fill-in  is  defined  as  the  number 
of  zero  entries  in  the  matrix  that  become  nonzero  as  elimination  proceeds.  The  total 
amoxmt  of  fill-in  that  arises  from  the  solution  of  a  sparse  system  of  equations  may 
be  limited  to  some  extent  by  selecting  a  proper  elimination  ordering. 

A  permutation  of  the  rows  and  coliimns  of  M  results  in  a  new  matrix  that 
represents  the  same  system  of  equations  as  M.  In  matrix  notation,  a  permutation 
of  the  system  is  defined  by 


{PMP'^)(Px)  =  Pb. 

where  P  is  an  n  x  n  permutation  matrix.  The  new  coefficient  matrix  PMP"^  will  in 
general  have  a  different  fill-in  structure  than  M.  Thus,  a  matrix  may  be  permuted 
before  Gaussian  elimination  is  performed  in  order  to  achieve  greater  efficiency.  An 
example  of  a  system  of  linear  equations  is  shown  in  Figure  5. 

Both  matrices  in  the  figure  represent  the  saime  system  of  equations  under  dif¬ 
ferent  permutations.  The  upper  triangular  form  DL"^  results  after  the  final  step 
of  the  forward  elimination  phase  of  Gaussian  elimination.  Notice  that  the  solution 
of  the  equations  defined  by  M  has  fill-in  terms  while  the  solution  of  PMP"^  does 
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not.  This  figure  clearly  indicates  that  savings  in  computation  time  may  be  made 
by  reordering  the  rows  and  columns  of  a  matrix. 

The  permutation  of  rows  and  columns  need  not  be  explicit;  it  may  be  made 
during  the  elimination  step.  An  interchange  of  rows  and  columns  b  equivalent  to 
the  selection  of  some  arbitrary  nonzero  diagonal  element  of  Af  as  the  pivot.  Various 
methods  of  pivot  selection  based  on  a  matrix  representation  have  been  presented 
[22].  These  will  not  be  dbcussed  in  thb  report.  Instead,  their  graph  theoretic 
counterparts  will  be  introduced  in  Section  2.2. 


2.2  A  Graph  Theoretic  Approach 

Define  G(M)  to  be  the  graph  G  =  (V,  E)  associated  with  JVf ,  such  that  each  variable 
in  the  system  of  equations  b  associated  with  a  vertex  v,-,  t  =  1  •  •  •  n,  and  that  for 
each  nonzero  entry  Mij  there  b  an  edge  (vi,v/)  with  head  v,  and  tail  v,-.  Such  a 
graph  represents  the  nonzero  structure  of  the  matrix  M  [25].  If  M  b  symmetric,  G 
may  be  an  undirected  graph.  However,  if  M  b  not  synametric,  (i.e.,  Mj,i), 

G  will  have  directed  edges. 

An  elimination  ordering  b  a  bijection  o  :  {1, 2,  •  •  • ,  n}  ♦-♦V  and  Ga  =  (V,  E,  a) 
b  an  ordered  graph.  Thb  graph  may  be  used  as  an  aid  in  selecting  an  elimination 
ordering  that  produces  the  minimal  total  fill-in. 

Fill-in  manifests  itself  on  the  graph  G  as  additional  edges  during  the  elimination 
process.  Pivoting  along  a  diagonal  element  in  M  b  equivalent  to  the  removal  of  a 
vertex,  v,  from  the  graph.  The  deficiency  of  v,  i?(v),  b  the  set  of  edges  defined  by 

P(v)  =  {(u,ti;)|(tt,w)  €  E,(v,w)  €  E,(u,w)  ^  E} 
and  represents  the  set  of  fill-in  edges  due  to  the  elimination  of  vertex  v.  The  graph 

G.  =  (v-{v>,f;(v-M)ui?(t»)) 

b  called  the  v- elimination  graph  of  G.  For  an  ordered  graph,  Ga^  the  elimination 
process 

■P(^'a)  =  [G  =  Go,  Gi,  G2,  •  •  • ,  G„_i] 


yLij 
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Figure  6:  The  Fill-in  Resulting  From  Two  Different  Elimination  Orders 


is  the  sequence  of  graphs  that  result  from  the  elimination  of  the  vertices  in  the  order 
specified  by  a.  The  total  fill-in,  F{Ga)  is  given  by 

f(G.)  =  Ux>K). 

«al 

Fill-in  that  occurs  with  the  elimination  of  a  vertex  is  a  function  of  the  elimination 
ordering  a.  An  example  is  shown  in  Figure  6  which  depicts  a  graph  with  two 
elimination  orderings,  a  and  0.  The  elimination  steps  and  the  fill-in  edges  that 
occur  are  shown.  The  total  amount  of  fill-in  |F'(C)|  is  labeled  for  the  two  elimination 
orderings.  This  example  illustrates  the  effect  of  elimination  ordering  on  the  total 
complexity  of  the  solution  of  a  system  Gaussian  elimination.  However,  finding 


an  elimination  ordering  that  produces  the  minimum  fill-in  for  a  given  graph  is  a 
problem  that  has  been  demonstrated  to  be  NP-complete  [13,32]. 

A  system  that  may  be  solved  with  no  fill-in,  (i.e.,  F{Ga)  =  0),  is  called  a 
monotone  transitive  graph  or  a  perfect  elimiruUion  graph.  It  can  be  observed  that 
the  edges  added  during  Gaussian  elimination  result  in  a  perfect  elimination  graph, 

g;  =  <v,(euf(g,))). 


.» f 


Rose  termed  this  the  monotone  transitive  extension  of  a  graph  and  also  characterized 
these  graphs  as  triangulated  graphs  [26].  A  triangulated  graph  is  one  in  which  every 
cycle  of  length  n  >  4  contains  a  chord.  Finding  the  minimum  set  of  additional 
edges  that  results  in  a  triangulated  graph  has  been  shown  to  be  equivalent  to  the 
problem  of  finding  the  best  elimination  ordering.  Even  though  this  observation 
provides  insight  into  the  problem,  it  leads  to  no  good  algorithms  for  the  selection 
of  an  elimination  ordering  [19]. 

2.3  Path  Problems  on  Graphs 

The  graph  representation  of  a  system  of  equations  is  more  thaui  merely  a  tool  for 
minimizing  fill-in.  The  single-source  path  problem  is  to  find,  for  all  vertices  v,  an 
expression  P(4,  v)  that  describes  all  paths  from  the  source  s  to  v.  Path  expressions 
are  built  from  the  operators  U  (union),  •  (concatenation),  and  *  (reflexive  transitive 
closure).  It  has  been  shown  that  the  solution  of  this  path  problem  on  graphs  also 
serves  as  a  solution  to  a  variety  of  problems  when  the  path  operators  are  mapped 
into  other  domains.  These  include  the  solution  of  systems  of  linear  equations,  data 
flow  problems,  and  shortest  path  problems  [29].  In  this  paper  only  those  path 
problems  where  the  desired  expressions  simply  give  the  conditions  under  which  a 
path  from  a  to  v  exists  are  considered. 

In  the  Boolean  algebra  of  thb  analysis  (J9,  V,  A,0,1),  the  operations  V  and 
A  denote  Boolean  “and"  and  “or,"  respectively,  applied  to  functions,  while  the 
distinguished  elements  0  and  1  denote  the  constant  functions  that  yield  0  2Lnd  1, 
respectively,  for  all  ugument  values.  This  algebra  is  sufficient  to  express  path 
expressions  for  a  simple  path  2malysis.  For  this  application  the  operators  described 
by  Tarjan  are  mapped  as  follows:  U  — »  V  aoid  •  —*■  A.  The  operator  •  evaluates  to 
1  for  all  Boolean  values  and  will  not  enter  into  the  discussion  herein. 

A  vertex  labeling  of  graph  G  =  {V,E)  is  an  assignment  i(v)  €  B  to  each  vertex 
V  €  V.  An  edge  labeling  is  an  assignment  A(u,v)  €  B  to  each  edge  (u,v)  €  E  . 
Gaussian  elimination  may  be  performed  directly  on  the  graph  G  to  solve  the  linear 
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system  Ax  =  6  when  this  system  is  recast  as 

A‘x  -h  6  =  X 

where  A'  =  I  —  A.  The  the  solution  to  this  system  is  the  same  as  that  for  its  other 
form,  but  the  reformulation  allows  the  solution  of  systems  of  equations  in  algebras 
that  do  not  have  an  inverse  under  the  concatenation  operation  (•).  In  particular, 
systems  of  Boolean  equations  do  not  have  an  inverse  under  A. 

The  Gaussian  elimination  algorithm  described  in  Figure  7  solves  the  single¬ 
source  path  problem  under  Boolean  algebra.  It  begins  by  placing  an  initial  label 
b{vi)  representing  a  Boolean  function  on  each  vertex.  Each  edge  (v,-,V;)  is  initially 
labeled  with  A^j.  Graph  structure  is  modified  as  described  earlier,  but  labels  are 
modified  as  well.  At  each  step,  before  a  vertex  is  eliminated,  its  label  is  propagated 
to  each  of  its  neighbors.  Likewise,  the  labels  of  fill-in  edges  are  modified.  Backsolv- 
ing  consists  of  a  similar  sequence  of  operations:  propagating  labels  back  to  vertices 
that  are  successively  added  to  the  graph.  At  the  termination  of  the  algorithm, 
each  vertex  is  labeled  with  an  expression  x(v,-)  that  satisfies  the  single-source  path 
problem  as  stated  above. 

The  complexity  of  this  algorithm  can  be  analyzed  in  terms  of  and-or  operations. 
Define  one  operation  to  be  of  the  form  (a  V  (6  A  c)).  During  forward  elimination 
two  sequences  of  operations  are  performed  for  each  vertex:  propagation  of  values 
to  neighbors  and  calculation  of  labels  of  fill-in  edges.  If  the  graph  is  directed,  the 
total  number  of  operations  performed  due  to  forward  elimination  is 

n  ^(x.) 


E 

*=x 


i=i 


Backsubstitution  results  in  the  propagation  of  labels  back  to  vertex  v,  from  eeurh  of 
its  d(v,)  neighbors.  Thus,  backsubstitution  contributes 


•=i 


operations.  The  total  complexity  for  performing  Gaussian  elimination  on  a  graph 
is  given  by 

n  <(•>.) 

y=» 


•  Zl 


i 
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{  Initial  Labelling  } 
for  t  =  1  to  n  do 
x(w,)  ^  b{vi) 
od 

{  Forward  Elimination  } 

Eq  E 

for  *  1  to  (n  —  1)  do 

Vi  ^  Vi.i  -  K,} 

Ei  Ei-i  n  [Vi  X  Vi\ 

for  each  u  €  Vi  such  that  (t;a,-,tt)  G  Ei-i  do 
x(u)  x(Wa.)  V  [x(Va.)  A  i4(Va.-,tt)l 

for  each  w  €  Vi  such  that  (u,  6  ^<-1  and  w  ^  do 

if  6  Ei  then  A(u,u;)  ♦—  A{u,w)  V  (A(u,Va,)  A  A(va,  ,tt;)] 

else  Ei  ^  EiU  {(«,«;)};  A(tt,u;)  *-  A(u,Va,)  A  A(t;a,.,tw) 
fi 
od 
od 
od 


{  Baeksubstitution  } 
for  *  ■«—  (n  —  1)  to  1  do 

for  each  «  €  such  that  («,Wa,)  €  Ei-i  do 
*(va.)  ^  x(t;a,)  V  [x(u)  A  A(tt,Va,)] 
od 
od 


CHAPTER  2.  GAUSSIAN  ELIMINATION 


18 


which  is  more  simply  expressed  as 

12  [d(t?<)*  +  d(t;<)] .  (1) 

t=i 

This  formula  reveals  that  the  total  complexity  is  extremely  sensitive  to  values 
of  <2(v,).  If  the  graph  can  be  operated  on  in  such  a  manner  that  d(v{]  is  bounded  by 
a  constant  for  all  t,  then  the  complexity  will  be  0(n).  However,  in  the  worst  case, 
when  G  is  the  complete  graph  of  n  vertices,  Kn,  complexity  will  be  0(n*). 

2.4  Choosing  an  Elimination  Ordering 

While  finding  an  elimination  ordering  that  results  in  the  miniTnum  total  operation 
count  is  an  NP-complete  problem,  heuristics  that  often  result  in  good  orderings  are 
available  [26].  One  heuristic  based  on  the  results  of  Equation  1  relates  the  total 
complexity  of  perfornung  Gaussian  elimination  to  the  elimination  degrees  of  the 
vertices.  A  method  called  the  minimum  degree  algorithm  produces  an  elimination 
ordering  by  selecting  the  vertex  with  the  minimum  degree  each  time  the  main  loop 
of  the  elimination  process  is  executed. 

This  method  is  computationally  efficient  and  will  often  find  good  elimination 
orderings.  For  some  classes  of  graphs  the  minimum  degree  algorithm  results  in  the 
elimination  ordering  that  gives  the  least  number  of  total  operations.  One  example 
is  the  class  of  graph  termed  General  Series  Parallel  (GSP).  This  class  of  graph 
expands  on  the  conventional  definition  of  series-parallel  graphs  to  include  graphs 
containing  acyclic  branches.  Graphs  of  this  class  may  be  constructed  inductively  by 
starting  with  a  single  vertex  and  applying  the  production  rules  given  in  Figure  8. 
Each  rule  adds  a  vertex  to  the  graph,  as  well  as  one  or  two  edges.  Examples  of  GSP 
graphs  for  a  complex  gate  that  might  arise  in  an  nMOS  circuit  and  a  shift  network 
are  provided  in  Figure  9.  The  shift  network  channel  graph  is  shown  redrawn,  to 
better  exhibit  its  GSP  structure. 

The  importance  of  GSP  graphs  is  that  they  arise  often  in  the  analysis  of  MOS 
networks  [7].  Graphs  of  this  type  may  be  eliminated  efficiently  by  finding  the  reverse 
of  the  sequence  of  production  rules  that  constructed  them.  It  is  clear  that  since  each 
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Figure  9:  Examples  of  GSP  Graphs 
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Figure  10:  Non-GSP  Graphs 

production  rule  adds  one  vertex  adjacent  to  at  most  two  others,  the  Tninimum  degree 
algorithm  will  effectively  find  the  reverse  sequence  of  production  rules  applied.  The 
algorithm  also  results  in  a  low  total  complexity  since  at  each  step  an  eliminated 
vertex  will  have  (at  most)  elimination  degree  (i(v,)  =  2.  Thus,  by  Equation  1  the 
total  complexity  of  solving  a  system  described  by  a  GSP  graph  will  be: 

+  <^(w.)]  =  6n  (2) 

f=i 

Other  graphs  that  arise  in  the  analysis  of  MOS  networks  have  low  elimination 
degrees  even  though  they  may  not  be  GSP.  Two  examples  are  shown  in  Figure  10. 
These  graphs  may  be,eliminated  effectively  by  the  minimum  degree  algorithm  with 
no  vertex  having  d(v)  >  3.  Thus,  the  minimum  degree  algorithm  leads  to  low  total 
complexity  for  these  graphs  as  well. 

The  minimum  degree  algorithm  is  an  example  of  a  greedy  algorithm  in  that  it 
chooses  vertices  without  regard  to  future  eliminations  of  vertices.  Another  greedy 
algorithm  is  the  minimum  deficiency  algorithm.  At  each  step,  a  vertex  is  chosen 
such  that  D{v)  is  minimized.  For  GSP  graphs,  the  minimum  deficiency  algorithm 
will  find  a  reverse  sequence  of  production  rules  in  a  manner  similar  to  the  minimum 
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degree  algorithm.  Each  Acyclic  and  Parallel  production  rule  leads  to  a  vertex  with 
D{v)  =  0,  and  each  Series  rule  introduces  a  vertex  with  D{v)  =  1.  Thus,  at  eacii 
elimination  step,  a  vertex  with  D{v)  =  1  or  0  may  be  found.  In  practice,  this 
algorithm  often  produces  an  ordering  equivalent  to  the  minimum  degree  algorithm 
for  GSP  graphs  and  has  the  same  bound  on  elimination  complexity. 

A  drawback  of  the  minimum  deficiency  algorithm  is  that  it  is  more  computa¬ 
tionally  intensive  than  the  minimum  degree  algorithm.  Calculating  the  deficiency 
of  a  vertex  involves  examining  all  of  its  pairs  of  neighbors.  Furthermore,  if  the 
implementation  is  not  clever,  this  calculation  will  be  performed  for  all  uneliminated 
vertices  at  each  step  in  the  elimination  process.  An  advantage  of  this  algorithm 
over  the  mininoum  degree  algorithm,  however,  is  that  for  arbitrary  graphs  that  are 
triangulated,  a  perfect  elimination  ordering  will  be  found  [26].  The  minimum  degree 
algorithm  is  not  guaranteed  to  find  such  an  ordering. 

This  chapter  introduced  Gaussian  elimination  and  the  importance  of  finding  a 
good  elimination  ordering.  The  ordering  algorithms  examined  were  greedy  in  nature 
and  were  performed  during  the  elimination  process.  In  Chapter  3,  algorithms  which 
produce  an  ordering  before  elimination  begins  and  which  guarantee  good  asymptotic 
complexity  for  certain  classes  of  graphs  will  be  examined. 


Chapter  3 


Generalized  Nested  Dissection 


This  chapter  will  discuss  a  more  complicated  method  of  finding  an  elimination 
ordering  called  nested  dissection  which  was  first  proposed  for  grid  graphs  that  arise 
in  finite  element  analysis  [14].  The  algorithm’s  basic  idea  is  to  use  a  “divide  and 
conquer”  strategy  on  the  graph.  Removal  of  a  set  of  vertices  results  in  two  new 
graphs  on  which  Gaussian  elimination  may  be  performed  separately.  The  results 
for  the  two  parts  may  then  be  combined  to  find  the  solution  for  the  entire  graph. 
This  method  has  been  shown  to  result  in  good  elimination  orderings  for  certain 
classes  of  graphs. 

An  observation  about  the  results  produced  by  nested  dissection  algorithms  is 
that  the  path  expressions  generated  in  Gaussian  elimination  tend  to  exhibit  a  bal¬ 
anced  structure  in  which  a  parallel  evaluation  strategy  may  be  used  [23,24].  The 
importance  of  balanced  expressions  was  discussed  briefiy  in  Chapter  1.  Later  in 
this  chapter,  the  structure  of  expressions  will  be  analyzed  in  more  detail. 


3.1  Graph  Separators 

A  separator  of  a  graph  is  a  relatively  small  set  of  vertices  whose  removal  causes  the 
graph  to  fall  apart  into  a  number  of  smaller  pieces.  If  5  is  a  class  of  graph,  n  is 
the  number  of  vertices  in  a  graph,  and  p(n)  is  some  function  of  n,  then  5  satisfies  a 
p{n)-separator  theorem  if  there  are  constants  a  <  1  and  0  >  0  such  that  a  separator 
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Figure  11:  A  1-Separator  for  a  TVee 

set  with  at  most  0p{n)  vertices  separates  the  graph  into  components  with  at  most 
an  vertices  each. 

Most  algorithms  based  on  separators  are  recursive,  first  finding  a  separator  for 
the  whole  graph  and  then  finding  separators  for  the  components.  For  these  algo¬ 
rithms  to  work  on  a  graph  of  class  5,  all  subgraphs  of  this  graph  must  also  be 
of  class  5.  When  a  class  satisfies  this  requirement  we  say  that  5  is  closed  under 
subgraph. 

Binary  trees  are  a  class  of  graph  that  is  closed  under  subgraph;  separation  at 
any  vertex  separates  the  graph  into  two  smaller  binary  trees.  Figure  11  shows  a 
binary  tree  and  its  decomposition  into  two  smaller  trees  at  a  single  vertex  separator. 
This  theorem  is  stated  for  binary  trees  (16]: 

Theorem  1  The  class  of  binary  trees  satisfies  a  1-separator  theorem  for  a  =  |  and 

0  =  1. 

A  planar  graph  is  one  which  can  be  drawn  on  a  plane  so  that  the  edges  of  the 
graph  only  intersect  at  their  endpoints(5].  For  planar  graphs,  the  following  theorem 
is  taken  from  Lipton  and  Tarjan  [2lj. 

Theorem  2  The  class  of  planar  graphs  satisfies  a  y/n-separator  theorem  for  a  =  | 
and  0  =  2y/2. 

In  more  recent  work,  Djidjev  proved  that  the  theorem  also  holds  for  0  =  [10]. 
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Series-pax&llel  graphs  have  long  been  used  to  represent  simple  electric  net¬ 
works  [llj.  Such  graphs  may  be  constructed  by  the  application  of  series  and  parallel 
production  rules.  These  graphs  obey  the  following  theorem. 

Theorem  S  The  class  of  series-parallel  graphs  satisfies  a  8-separator  theorem  for 
a  =  I  and 

This  is  accepted  as  a  folk  theorem.  We  will  defer  its  proof  until  the  next  chapter, 
which  is  devoted  to  series-parallel  graphs. 

These  theorems  are  presented  to  provide  examples  of  the  types  of  separators 
that  have  been  shown  to  exist,  and  lead  to  algorithms  for  finding  separators  for 
limited  classes  of  graphs  (i.e.,  binary  trees,  planar  graphs).  Except  for  the  simplest 
cases,  finding  separators  is  a  non-trivial  problem  and  no  good  algorithms  exist  for 
finding  separators  greater  than  two  in  site  for  arbitrary  graphs. 


3.2  Elimination  Ordering  Algorithms 

Many  variations  of  elimination  ordering  algorithms  are  based  on  nested  dissection. 
These  algorithms  have  the  following  basis  as  a  common  starting  point.  The  main 
differences  involve  the  separators  found  for  different  classes  of  graphs  and  the  re¬ 
sulting  complexity  bounds. 

Given  a  graph  G  with  n  vertices,  partition  the  graph  into  parts  C.  Ai,  Ai,  etc., 
such  that  C  is  a  separator  of  the  graph.  Number  the  vertices  in  C  from  n  down  to 
(n  —  \C\  -h  1)  so  that  they  are  eliminated  last  from  the  graph.  Recursively  number 
the  elements  of  each  of  the  remaining  parts  of  G,  (Xi,  Aj,  ■  •  •)  from  1  to  (n  -  |C|). 
The  procedure  continues  until  all  vertices  are  numbered.  Typically,  the  recursion 
will  cease  when  the  size  of  a  set  reaches  some  small  threshold  value,  no,  in  which 
case  the  vertices  of  the  set  are  arbitrarily  assigned  numbers  in  the  given  range. 

3.2.1  Nested  Dissection 

Alan  George  proposed  the  first  nested  dissection  algorithm  [14].  It  was  defined  only 
for  grid  graphs  for  which  there  are  simple  separators.  Figure  12  shows  a  grid  graph 
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Figure  12:  A  Grid  Graph  of  Size  n  =  k  x  k  With  Separator  Set  Indicated 

with  n  =  kx  k.  Removal  of  the  middle  column  and  middle  row  separates  the  graph 
into  four  subgraphs.  The  algorithm  is  as  follows.  Assume  that  k  is  one  less  than  a 
power  of  two. 

•  Remove  row  (Ic  +  l)/2  and  column  {k  +  l)/2.  Give  the  highest  numbers  to 
these  2Ac  —  1  vertices. 

•  There  are  now  four  components  of  the  original  graph.  If  their  sizes  are  greater 
than  one,  recursively  number  the  components.  Otherwise,  number  the  four 
vertices  in  the  range  specified. 

Graphs  where  k  is  not  one  less  than  a  power  of  two  may  be  handled  by  adding 
some  number  of  “dummy"  vertices.  This  algorithm  is  well  suited  for  finite  element 
mesh  analysis  and  results  in  O(nlgn)  fill-in  and  0(nt)  total  operation  count. 
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3.2.2  Automatic  Nested  Dissection 

This  algorithm  is  defined  for  any  arbitrarily  connected  graph  [15].  A  graph  may 
be  partitioned  into  levels  by  performing  a  breadth-first  search  beginning  at  some 
starting  vertex  v.  Each  level,  Lo,..-,Lr  is  a  partition  of  the  graph.  Lo  contains 
vertex  v,  Li  contains  those  vertices  adjacent  to  v,  L2  contains  those  vertices  two 
edges  away  from  v,  etc.  Each  level  is,  to  some  extent,  a  separator  of  the  graph.  The 
algorithm  is  as  follows: 

•  Partition  the  graph  into  levels  Lo,  •  •  • « L,  by  performing  a  breadth-first  search 
on  the  graph. 

•  The  vertices  of  level  s  =  [(r  -f-  l)/2j  separate  the  graph.  Choose  a  minimal 
subset  of  L«  that  is  still  a  separator  and  assign  the  highest  numbers  to  these 
vertices. 

•  Recursively  number  each  component  (there  may  be  more  than  two)  whose 
size  is  greater  than  no. 

It  has  been  suggested  that  this  algorithm  results  in  0(n  Ig  n)  fill-in  for  a  number 
of  finite  element  meshes.  While  it  may  have  asymptotic  complexity  approaching  this 
limit,  we  have  observed  better  performance  from  the  minimum  degree  algorithm 
described  in  Section  2.4.  Table  1  shows  the  results  we  have  obtained  for  a  variety 
of  square  grid  graphs  of  size  y/n  x  y/n. 

3.2.3  Generalized  Nested  Dissection  -  I 

This  algorithm  was  derived  by  Lipton,  Rose  and  Tvjan  [20] .  Given  a  graph  G  of 
class  S  that  obeys  an  /(n)-separator  theorem  with  constants  a  and  P,  partition  the 
graph  into  three  parts  A,  B,  and  C,  such  that  C  is  a  separator  of  the  graph  with 
no  more  than  0f{n)  vertices. 

•  If  there  are  no  more  than  no  =  (i^)*  vertices,  number  them  arbitrarily  in 
the  range  specified. 
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Grid 

Minimum  Degree 

Automatic  Nested 

Size 

Algorithm 

Dissection 

(») 

fill-in 

complexity 

fill-in 

complexity 

16 

18 

172 

22 

200 

25 

37 

350 

46 

420 

36 

71 

678 

100 

1032 

49 

122 

1228 

165 

1824 

64 

178 

1840 

240 

2822 

81 

280 

3198 

321 

4010 

100 

376 

4442 

454 

5996 

121 

487 

6078 

612 

8556 

144 

649 

8662 

819 

12086 

Table  1:  Comparison  of  Minimum  Degree  and  Automatic  Nested  Dissection 

•  Find  sets  A,  B,  and  C  that  satisfy  the  ^/n-separator  theorem  where  C  is  the 
separator  set. 

•  Assign  the  vertices  of  C  the  highest  numbers. 

•  Delete  all  edges  whose  endpoints  are  both  in  C.  Apply  the  algorithm  recur¬ 
sively  to  the  subgraph  induced  by  BuC  and  also  to  the  subgraph  induced  by 
AUC. 


The  vertices  of  C  are  included  in  the  recursive  call;  therefore  some  method  of 
recording  which  vertices  have  already  been  numbered  is  required.  This  algorithm 
was  analyzed  in  particular  detail  for  classes  of  graph  that  follow  a  v^separator 
theorem.  From  a  result  of  Lipton  and  Tarjan  we  know  that  planar  graphs  satisfy 
a  '^/n-separator  theorem  [21].  The  ordering  produced  by  this  algorithm  will  result 
in  0(n  Ign)  fill-in  and  0(nt)  total  operation  count  for  plamar  graphs,  although 
the  coefficients  of  the  actual  fill-in  and  operation  counts  are  very  large.  However, 
the  authors  believe  that  their  worst-case  bounds  are  very  pessimistic  and  that  the 
algorithm  would  be  useful  for  very  large  graphs.  Applications  of  such  a  theorem 
include  finite  element  meshes  (which  are  planar  embeddings  of  a  planar  graph)  and 
GSP  graphs,  which  also  belong  to  the  class  of  planar  graphs. 
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3.2.4  Generalized  Nested  Dissection  -  II 

A  variation  to  the  generalized  nested  dissection  algorithm  of  the  preceding  section 
has  been  proposed  for  separators  that  divide  the  graph  into  more  than  two  pieces, 
A  and  B  [16].  This  algorithm  assumes  that  the  separator  C  splits  the  graph  into 
pieces  Ai,  Aj,  •  •  • ,  A,.  A  sep2irate  recursive  call  is  made  for  each  part,  A,-,  1  <  t  <  r. 

•  If  there  are  no  more  than  no  vertices,  then  simply  number  the  vertices  jurbi- 
trarily  in  the  range  given. 

•  Find  a  separator  with  k  <  0y/n  vertices  that  divides  the  graph  into  pieces 
•^ii -^Ji  •  •  •  1  A,,  where  |A,|  <  an.  Number  the  vertices  of  C  arbitrarily  from 
(n  —  IC]  +  1)  to  n. 

•  Call  the  algorithm  recursively  r  times  for  each  component  A,,  1  <  t  <  r  to 
number  the  remaining  vertices  in  the  range  between  1  and  n  —  |C|. 

This  algorithm  differs  from  the  previous  one  in  that  it  does  not  include  the 
vertices  of  C  in  the  recursive  call.  Also,  by  recurslng  on  more  than  two  subgraphs 
at  each  level  it  does  not,  in  general,  result  in  the  same  bounds  for  fill-in  and  total 
operation  coimt.  However,  for  planar  graphs,  the  same  bounds  are  met.  For  other 
classes  of  graph  with  '^/n-separator  theorems  it  may  even  perform  better  [16|.  For 
the  rest  of  this  report,  the  term  “generalized  nested  dissection”  will  be  considered 
to  refer  to  this  variation. 

3.3  Separator  Trees 

The  recursion  of  the  algorithms  described  above  suggests  a  natural  decomposition  of 
graphs  in  terms  of  their  separators.  At  the  highest  level  is  a  separator  that  divides 
the  graph  into  components.  These  components  themselves  L^ive  separators,  and  so 
on.  At  the  lowest  level  are  components  that  may  be  divided  no  further,  possibly 
containing  only  a  single  vertex.  This  decomposition  may  be  described  in  terms  of 
a  structure  called  a  separator  tree. 
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Figure  13:  A  Graph  and  Its  Separator  Tree 

A  separator  tree  for  a  graph  is  shown  in  Figure  13.  Such  trees  graphically  show 
how  separators  arise  from  a  graph;  they  also  reveal  where  fill-in  may  occur.  The 
following  theorem  is  from  [27]. 

Theorem  4  Let  G  =  (V,  F?,a)  be  an  ordered  graph.  Then  (v,u;)  is  an  edge  of  G’ 
if  and  only  if  there  exists  a  path  M  =  [v  =  uj,  •  •  • ,  Vk+i  =  lu]  in  Ga  such  that 

<  mm(a~*(t;),a~^(u;))  for  2  <  i  <  k. 

This  theorem  states  that  am  edge  (v, «;)  fills  in  if  and  only  if  there  is  a  path  from  v 
to  w  containing  only  vertices  eliminated  before  either  v  or  w. 

Theorem  4  may  be  used  to  calculate  bounds  on  fill-in  due  to  a  nested  dissection 
algorithm.  Consider  a  node  of  the  separator  tree  C,  amd  its  subtrees  Aj,  A2,  •  •  • ,  A„. 
Since  there  are  no  paths  between  A,-  and  A,-  initially  (for  any  two  subtrees  where 
i  ^  j),  and  the  elements  of  C  are  given  higher  elimination  numbers  than  those  in 
Ai  and  Ay,  there  may  not  be  a  fill-in  edge  between  ainy  member  of  A,-  and  A,.  Thus, 
the  sepau'ator  tree  shows  that  the  only  possible  fill-in  that  may  occur  is  along  the 
edges  of  the  tree,  or  between  the  vertices  of  an  individual  node  of  the  tree.  This 
fact  may  be  used  to  calculate  bounds  on  the  total  amount  of  fill-in  using  a  nested 
dissection  ordering  algorithm  for  some  classes  of  graph. 
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3.4  Complexity  Versus  Structure 

Nested  dissection  algorithms  are  based  on  a  divide-ajid-conquer  strategy  that  suc¬ 
cessively  splits  the  graph  into  smaller  subproblems  to  be  solved  independently.  In¬ 
formation  about  the  smaller  problems  Is  then  combined  to  find  the  solution  of  the 
whole.  The  discussion  on  separator  trees  showed  how  this  technique  could  lead  to 
bounds  on  the  amount  of  fill-in  that  occurs  amd  also  boiinds  on  total  complexity. 

The  divide-and-conquer  approach  also  leads  to  path  expressions  that  exhibit 
bailance.  Gaussian  elimination  can  be  viewed  as  the  propagation  of  information 
through  a  graph.  Path  expressions  are  built  by  propagating  information  about 
the  graph  to  one  point  amd  then  propagating  the  information  back  to  all  points. 
The  structure  of  the  expressions  generated  depends  on  how  the  information  flows 
through  the  graph,  which  is  determined  by  the  elimination  ordering. 

In  Chapter  1,  a  simple  graph  as  shown  in  Figure  2  was  eliminated  with  two 
different  orderings  of  vertices.  Elimination  of  vertices  from  left  to  right  resulted 
in  the  DAG  of  Figure  3,  while  a  different  elimination  ordering  resulted  in  a  more 
balanced  DAG,  that  of  Figure  4.  The  elimination  ordering  of  the  second  can  be  seen 
to  be  one  that  would  be  produced  by  a  generalized  nested  dissection  algorithm, 
eliminating  the  middle  vertex  (a  separator)  last. 

A  graph  resembling  a  binary  tree  b  shown  in  Figure  14.  The  minimum  degree 
algorithm  could  result  in  a  number  of  different  elimination  orderings,  one  of  them 
as  shown  in  Figure  15.  The  resulting  DAG  b  shown  as  well.  Once  again,  this  DAG  is 
not  balanced,  but  b  "long  and  skinny.”  A  generalized  nested  dissection  algorithm 
would  guarantee  elimination  of  the  vertices  in  an  order  similar  to  that  depicted  in 
Figure  16.  The  resulting  DAG  for  thb  ordering  b  shown  as  well.  It  b  obvious  that 
thb  DAG  b  more  balanced,  just  as  the  DAG  of  Figure  4.  We  can  say  that  this  DAG 
b  “short  and  fat.” 

In  many  situations,  these  path  expressions  of  the  DAG  are  calculated  symbolically 
(as  shown)  and  used  repeatedly  to  solve  problems  that  involve  different  labelings 
of  graphs  with  the  same  structure.  A  long,  skinny  expression  tree  means  that  all 
operations  must  be  performed  sequentially,  whereas  a  more  balanced  tree  shows  a 
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Figure  14:  A  Graph  Resembling  a  Binary  Tree 

level  of  independence  aunong  the  operations.  An  appropriate  architecture  may  take 
advantage  of  this  independence  by  using  multiple  processors  to  exploit  fine-grained 
parallelism. 

In  general,  the  expressions  generated  by  Gaussian  elimination  using  nested  dis¬ 
section  ordering  algorithms  are  balanced,  and  hence  have  small  depth,  if  there  are 
good  separators  for  the  graph.  In  the  following  chapters  algorithms  that  lead  to 
the  identification  of  separators  for  a  variety  of  graphs  will  be  presented.  The  extent 
to  which  they  separate  the  graph  will  be  shown  to  lead  to  an  upper  bound  on  the 
depth  of  expressions  generated. 
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Figure  16:  The  Resulting  DAG  With  an  Ordering  Due  to  Nested  Dissection 


Chapter  4 


Separators  for  Series-Parallel 
Graphs 


In  this  chapter  some  properties  of  a  class  of  graph  called  Series-Parallel  (SP)  are 
examined..  This  class  of  graph  is  often  used  to  describe  electrical  networks,  since 
many  networks  are  composed  of  only  series  aind  parallel  connections  of  elements. 
Reduction  rules  for  SP  graphs  are  developed  first.  It  is  then  shown  how  these  can 
be  used  to  find  separators  for  these  graphs. 

4.1  Series-Parallel  Reduction  Rules 

SP  graphs  may  be  described  by  the  composition  of  their  edges  in  terms  of  Series, 
S,  and  Parallel,  P,  reduction  rules.  The  two  rules  are  shown  in  Figure  17.  Each  of 
the  rules  results  in  the  reduction  of  the  number  of  edges  in  the  graph  by  one.  By 
the  application  of  a  sequence  of  these  reduction  rules,  it  is  possible  to  reduce  a  SP 
graph  to  a  single  edge.  For  SP  graphs  the  following  applies. 

Theorem  5  A  multigraph  is  SP  if  and  only  if  it  can  be  reduced  to  the  trivial  SP 
graph  (two  vertices  joined  by  a  single  edge)  by  a  sequence  of  series  and  parallel 
reductions. 

This  is  a  trivial  generalization  of  a  corollary  proposed  by  Duffin  (ll].  Thus,  the 
application  of  these  reduction  rules  may  be  used  to  test  if  a  graph  is  indeed  SP. 

34 
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Figure  17:  Series  and  Pairallel  Reduction  Rules 

There  may  be  many  different  combinations  of  the  rules  that  result  in  a  single  edge, 
all  of  which  being  valid  reductions  of  the  graph.  This  paolicular  set  of  reduction 
rules  has  been  shown  to  possess  the  Chureh-Rosatr  property  [30]  which  means  that 
the  order  in  which  the  rules  are  applied  will  not  affect  the  possiblity  of  reducing  the 
graph  to  a  single  edge. 

The  manner  in  which  the  reduction  rules  are  applied  to  a  graph  can  be  described 
by  a  binary  decomposition  tree  [31].  Such  a  tree  is  analagous  to  a  binary  expression 
tree,  where  the  operators  represent  functions  performed  on  the  edges  of  the  graph. 
Figure  18  illustrates  the  construction  of  a  decomposition  tree  for  a  simple  SP  graph 
where  endpoints  are  used  to  identify  the  edges.  A  decomposition  tree  may  be  built 
by  labeling  the  edges  of  the  graph  for  each  reduction  where  the  labeling  procedure 
is  described  in  Figure  19.  The  initial  labels  of  an  edge  are  assumed  to  be  the  set  of 
its  endpoints. 

It  is  interesting  to  note  that  if  the  P  and  S  operators  axe  interpreted  as  the 
Boolean  operators  V  and  A,  the  decomposition  tree  gives  all  paths  between  the 
two  remaining  vertices  in  much  the  same  manner  that  the  expressions  generated  by 
Gaussian  elimination  describe  paths  for  all  pair  of  vertices.  In  fact,  the  reduction 
procedure  described  is  identical  to  the  forward  elmination  step  of  Gaussian  elimi¬ 
nation  where  vertices  are  intially  labeled  with  x(v,]  =  0  and  the  ordering  algorithm 
used  is  simply  the  selection  of  some  vertex  with  d(v,')  =  2. 
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(1.2)  (2.5)  (3.5)  (3.4) 


Figure  18:  Construction  of  a  Decomposition  Tree  for  a  SP  graph 
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Figure  19:  Edge  Labeling  for  the  Reduction  of  SP  Graphs 

4.2  Separators  of  SP  Graphs 

Earlier  a  theorem  that  SP  graphs  have  2-separator8  for  a  =  |  was  provided  (Theo¬ 
rem  3).  In  this  section  it  is  shown  that,  with  a  proper  labeling  of  the  decomposition 
tree  of  a  SP  graph,  such  separators  may  be  easily  found. 

Each  S  node  in  the  decomposition  tree  represents  the  removal  of  one  vertex 
from  the  graph.  Each  S  node,  n,  will  be  assigned  an  internal  weight  t(n)  =  1  to 
reflect  this.  Conversely,  a  P  rule  represented  the  removal  of  no  vertex,  but  only  the 
removal  of  an  edge.  Each  P  node  will  be  given  an  internal  weight  i{n)  =  0. 

Each  node  n  in  the  tree,  either  S  or  P,  represents  a  separator  of  the  graph.  Its 
total  weight 

«'(»*)  =  II  *(p) 

p€d*it*n^nti{n] 

gives  the  number  of  vertices  separated  from  the  rest  of  the  graph  by  this  separator. 
A  mapping  between  nodes  in  the  tree  and  separators  may  be  easily  constructed  by 
the  proper  labeling  of  each  node  in  the  tree  with  an  additional  field.  Each  leaf  node 
represents  an  edge,  (vi,V]).  A  Series  combination  of  two  SP  subgraphs  defined  by 
their  endpoints  (vi,V2)  and  (v],vs),  results  in  the  separator  set  (vi,vs)  that  serves 
to  separate  everything  internal  to  (vt,vs)  from  the  rest  of  the  graph.  Parallel  rules 
always  combine  two  subgraphs  whose  endpoints  are  the  same.  By  labeling  each 
node  of  the  tree  with  the  endpoints  of  the  subgraphs  they  represent,  each  node 
will  be  labeled  with  a  separator  pair.  Figure  20  shows  the  labeling  operations  for 
series  and  parallel  rules.  By  applying  these  from  the  bottom  up  each  node  in  a 
decomposition  tree  may  be  labeled  with  its  separator  set  and  total  weight.  The  tree 
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Figure  20:  Labeling  for  Each  Node  in  a  Decomposition  Tree 


w.s 


Figure  21:  A  Decomposition  IVee  After  Labeling 

of  Figure  18  is  shawm  again  in  Figure  21  with  each  node  labeled  with  the  proper 
total  weight  w  and  its  separator  set  5. 

It  should  be  noted  that  the  weight  of  the  root  of  the  decomposition  tree  is  two 
less  than  the  totaJ  number  of  vertices  in  the  graph.  This  is  because  the  weight  of  each 
node  represents  how  many  vertices  would  be  separated  from  the  rest  of  the  graph, 
not  including  the  two  vertices  of  the  separator  set.  The  algorithm  of  Figure  22  is 
a  direct  implementation  of  the  general  step  of  the  constructive  proof  and  finds  a 
2-separator  in  a  graph  for  which  a  decomposition  tree  has  been  constructed.  In  the 
next  section  this  fimction  will  be  used  as  part  of  another  algorithm  for  computing 
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faction  Find2Separator(  tree  T,  weight  W)  returns  tree 
If  w{T)  <  np  then  return  T 

else 

If  w(LSON(T))  <  ty(RSON(7’))  then 
Fmd2Separator(RSON(r),  W) 

else 

Fiiid2Separator(LSON(T) ) 

fl 

fl 

end  Fmd2Separator; 


Figure  22:  Recursive  Algorithm  for  Finding  2-SepaLrator  of  a  SP  Graph 

an  elimination  ordering  for  a  SP  graph  based  on  nested  dissection.  At  this  point 
the  theorem  that  SP  graphs  have  2'Separators  with  o  =  |  will  be  proved. 

1 

* 

PROOF:  Nodes  are  examined  in  the  decomposition  tree  starting  at  the  root  and 
descending  through  the  tree.  The  general  step  of  the  proof  will  eventually  accept 
a  node  as  a  representing  a  valid  separator  of  the  SP  graph.  W  represents  not  the 
total  weight  of  the  tree,  but  the  number  of  nodes  in  the  graph.  For  each  node  in  the 
tree  there  is  a  separator  set  of  size  two  that  separates  the  vertices  of  its  descendent 
rules  from  all  others  in  the  graph.  It  is  obvious  that  some  node,  nl,  whose  weight  is 
less  than  or  equal  to  |(V  may  be  found  but  it  remains  to  be  shown  that  the  weight 
of  everything  else  in  the  tree,  (W  -  [«/(nl)  +  2]),  is  also  <  \W. 

s 

> 

J 

J 

J 

INITIALIZATION:  nl  is  the  root  of  the  tree. 

i 

*1 

INVARIANT:  W  —  (w(nl)  +2)  <  |W.  This  is  initially  true  when  nl  is  the  root  since 
W  =  u;(nl)  +  2. 

General  Step:  Find  the  child  of  nl  with  the  greatest  weight.  Call  this  node 
n2.  If  u;(n2)  >  \W  then  repeat  the  general  step  with  nl  <—  n2.  The  invariant  still 
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• 

holds. 
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Otherwise,  n2  is  a  node  that  represents  a  separator  set  of  size  two  that  divides 
the  graph  as  required.  The  weight  of  n2,  u;(n2),  is  small  enough;  it  remains  to  be 
shown  that  the  weight  of  the  rest  of  the  graph  without  the  separator  set  is  small 
enough  as  well. 

It  is  known  that  w(nl)  >  \W .  Also,  since  each  node  in  the  tree  hzis  two  children 
and  (at  most)  internal  weight  1,  w{n2)  x  2  +  1  >  u;(nl).  Therefore, 

«;(n2) 

The  number  of  vertices  in  the  rest  of  the  graph  is  given  by  (W  —  (u;(n2)  +2]).  Using 
the  previous  relation  we  get 

(IV  -  Kn2) +2|)  <  llV  -  ? 

which  Indicates  that  both  parts  are  small  enough. 

□ 


4.3  An  Elimination  Ordering  for  SP  Graphs 

The  method  of  finding  separators  for  SP  graphs  discussed  above,  can  be  used  to 
produce  an  elimination  ordering  using  a  procedure  based  on  generalized  nested 
dissection.  While  SP  graphs  can  be  eliminated  efficiently  using  the  minimum  de¬ 
gree  algorithm,  we  C2m  expect  the  nested  dissection  elimination  ordering  to  yield 
expressions  that  will  exhibit  balance,  and  whose  depth  will  be  small. 

An  obvious  implementation  of  a  nested  dissection  algorithm  would  proceed  as 
follows: 

•  If  there  are  only  one  or  two  vertices,  number  these  in  the  rainge  specified. 

•  Build  a  decomposition  tree  for  the  SP  graph. 

•  Find  a  sepuator  for  the  SP  graph  with  decomposition  tree  given  by  the  algo¬ 
rithm  of  Figure  22. 


Number  these  two  vertices  the  highest. 
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•  Elimination  of  the  separators  results  in  two  new  SP  graphs.  Recursively  num¬ 
ber  these. 

This  would  not  be  very  efficient  however,  since  it  would  require  the  building  of 
a  decomposition  tree  for  each  subgraph.  A  key  observation  of  the  decomposition 
tree  is  that  the  removal  of  a  node  splits  the  tree  into  two  trees  that  describe  the 
subgraphs  of  the  original  graph  defined  by  the  separator.  The  recursive  algorithm 
presented  in  Figure  23  produces  an  equivalent  elimination  ordering  by  relabeling 
fields  of  the  modified  decomposition  tree  instead  of  rebuilding  a  tree  each  time. 

The  algorithm  assumes  that  a  decomposition  tree  has  been  produced  and  cor¬ 
rectly  labeled  as  described  earlier.  The  vsu'iable  n  is  assumed  to  contain  the  total 
number  of  vertices  in  the  graph,  and  global  variables  ElimCnt  and  ElimNums 
are  used  by  the  procedure  AssignNumber  for  the  numbering  of  vertices.  Another 
function  called  Label  is  responsible  for  recalculating  the  weights  of  the  entire  tree 
when  a  subtree  has  been  removed. 

Removal  of  a  node  from  the  tree  may  be  reflected  by  simply  marking  its  weight 
field,  to  zero.  The  procedure  Label  does  not  descend  past  nodes  with  zero  weight. 
Upon  completion  of  the  algorithm,  the  vertices  of  the  graph  will  be  numbered  in  an 
order  that  represents  nested  dissection  for  a  SP  graph. 

Applying  the  elimination  ordering  produced  by  the  algorithm  of  Figure  23  to 
Gaussian  elimination  to  a  system  of  Boolean  equations  represented  by  the  graph 
of  Figure  24,  the  DAG  of  Figure  25  is  obtained.  For  comparison,  the  results  of 
the  minimum  degree  algorithm  are  shown  as  well  in  Figure  26.  As  expected,  the 
generalized  nested  dissection  algorithm  for  SP  graphs  yields  expressions  with  a  small 
depth  (13),  while  the  minimum  degree  algorithm  resulted  in  expressions  of  depth 
20. 

The  algorithm  proposed  for  finding  separators  of  SP  graphs  is  interesting  in  that 
it  closely  resembles  the  forward  elimination  part  of  Gaussian  elimination.  In  the 
next  chapter,  this  technique  will  be  expanded  to  handle  graphs  that  are  not  SP. 
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type  nodetype  =  (Series, Parallel); 

type  tree  =  pointer  to  record 

verts  si,  82;  {The  vertices  of  the  separator  set} 

int  w;  {  The  weight  of  the  node} 

tree  LSON,  RSON;  {The  two  children  of  this  node} 

nodetype  mle;  {  Whether  this  node  represents  a  Series  or  Parallel  rule} 

endrecord; 

{Global  variables} 
int  ElimCnt  =n; 

array  of  int  £;/imJVum3(l..n]  =  {0,0,---,0}; 

procedure  AssignNumber(vert  v) 
if  £/tmArums[v]  =  0  then 

^/im^ums[w]  *—  ElimCnt', 

ElimCnt  «—  ElimCnt  —  1; 
fi 

end  AssignNumber; 

function  Label  (tree  T)  returns  int 
If  T.w  =  0  then  return  0; 
else 

r.w  =:  Label(r.LSON)  +  Ubel(T.RSON); 
if  T.rule  =  Series  then  T.wT.w  +  1;  fi 
return  T.w; 
fi 

end  Label; 

procedure  ConiputeOrdering(  tree  T) 
if  T.vj  =  0  then 

AssignNumber(T.8l); 

A8signNumber(T.82); 

else 

t «—  Find2Separator(T); 

AssignNumber(t.8l); 

AssignNiimber(t.s2) ; 
t.w  0; 

Label(r); 

ComputeOrdermg(T] ; 

ComputeOrdermg(t); 

fi 

end  ComputeOrdering; 


Figure  23:  Nested  Dissection  Elimination  Ordering  Algorithm  for  SP  Graphs 
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Figure  24:  A  Szunple  Graph 


Chapter  5 


Separators  for  Arbitrary  Graphs 


The  identification  of  separators  in  arbitrary  graphs  is  a  difficult  problem.  While 
there  are  good  algorithms  for  finding  bi-connected  [3]  and  tri-connected  [18]  compo¬ 
nents  in  graphs  that  lead  to  the  identificaton  of  1-  and  2-separators,  and  algorithms 
that  find  separators  for  certain  classes  of  graph,  (e.g.,  plainar  graphs),  there  are  no 
good  algorithms  for  finding  separators  in  arbitrary  graphs.  Identification  of  only 
the  1-  and  2-sepaj‘ators  in  a  graph  is  not  sufficient,  and  arbitrary  graphs  are  not 
guaranteed  to  have  them  anyway.  A  more  general  method  of  finding  separators  for 
use  with  a  nested  dissection  ordering  algorithm  is  desired. 

This  chapter  will  develop  a  method  for  finding  separators  in  airbitrary  graphs 
based  on  decomposition  trees.  Reduction  rules  will  be  formulated  based  on  the 
elimination  of  vertices  that  occurs  in  Gaussian  elimination.  Using  an  elimination 
ordering  that  gives  expressions  of  small  total  complexity,  it  will  be  shown  that  the 
elimination  sequence  can  be  reordered  to  obtain  expressions  of  small  depth. 


5.1  Elimination  Cliques 

In  the  previoxis  chapter,  an  algorithm  was  presented  for  finding  2-sep2U'ators  in  SP 
graphs.  The  reduction  rules  proposed  for  SP  graphs  repeatedly  modified  the  graph 
until  a  single  edge  remained.  It  was  shown  that  the  resulting  decomposition  tree 
for  SP  graphs  gave  information  about  paths  through  the  graphs,  and  that  in  this 
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respect,  was  very  similar  to  the  result  of  Gaussian  elimination  for  Boolean  systems. 
In  fact,  the  Series  reduction  rule  is  exactly  the  process  that  occurs  in  Gaussian 
elimination  when  a  vertex  of  degree  two  is  removed:  one  vertex  is  removed  and  a 
fill-in  edge  is  added  to  the  graph. 

An  important  observation  of  the  information  presented  in  Chapter  4  is  that  the 
endpoints  of  each  edge  and  fill-in  edge  represented  a  separator  of  the  graph.  In 
the  elimination  of  SP  graphs,  each  edge  that  fills  in  does  so  to  maintain  paths  that 
existed  between  two  vertices  before  the  elimination  of  a  vertex.  Theorem  4  stated 
that  an  edge  (v,  w)  fills  in  only  if  there  is  a  path  from  v  to  w  containing  only  vertices 
eliminated  before  either  v  or  in.  The  endpoints  of  such  am  edge  separate  all  vertices 
comprising  the  path  from  the  rest  of  the  graph.  Since  an  eliminated  vertex  in  an  SP 
graph  may  only  contribute  to  one  fill-in  edge,  it  is  possible  to  record  the  vertices  in 
the  paths  defined  by  a  fill-in  edge  by  using  an  appropriately  labeled  decomposition 
tree  for  each  edge  in  the  graph. 

Each  step  in  the  forward  elimination  phase  of  Gaussian  elimination  results  in 
the  removal  of  a  single  vertex,  v,-,  from  the  graph.  For  SP  graphs,  a  vertex  has 
exactly  two  neighbors  at  the  time  of  its  elimination.  The  basic  operation  is  the 
same  whether  there  is  an  edge  between  the  neighbors  of  v,-  or  not.  This  one  rule  is 
illustrated  in  Figure  27  and  indicates  an  optional  edge  between  two  vertices.  If  there 
is  an  edge,  elimination  of  v,'  does  not  result  in  a  fill-in  edge,  but  merely  a  labeling 
of  the  edge  with  a  new  value.  If  there  is  not  an  edge,  elimination  of  v,-  results  in  a 
true  fill-in  edge.  In  either  situation,  the  set  of  the  neighbors  of  v,-  is  a  2-separator 
of  the  graph,  separating  one  or  more  vertices  from  all  others  in  the  graph.  For 
the  purpose  of  finding  separators,  a  single  reduction  rule  may  be  defined:  the  SP 
reduction  rule.  The  application  of  this  one  rule  involves  merely  finding  any  vertex 
with  two  neighbors. 

In  an  arbitrary  graph,  there  may  be  more  than  two  neighbors  of  v,.  However, 
the  set  adj{vi)  still  serves  to  separate  v,-  from  the  graph.  Because  the  number  of 
neighbors  of  v,-  may  be  greater  than  two,  the  result  of  elimination  of  v,  is  not,  in 
general,  a  single  fill-in  edge,  but  is  the  set  of  edges 

C'(vt)  =  {(u,«;)|u  G  G  ody(u,),u  7^  u>}. 
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Figtire  27:  A  Single  Reduction  Rule  for  SP  graphs 

This  is  similar  to  the  deficiency  of  a  vertex,  but  includes  edges  already  present  in 
the  graph  as  well.  The  set  of  vertices,  ady(v,),  and  edges,  form  a  clique' 

in  the  graph,  Gi,  resulting  after  elimination  of  v,-.  The  edges  represent  sections  of 
all  paths  of  which  v,-  was  a  member.  In  SP  graphs,  there  was  exactly  one  of  these 
edges.  However,  in  an  arbitrary  graph,  v,-  is  a  member  of  all  of  the  paths  defined 
by  the  edges  C(vj). 

Elimination  of  another  vertex,  €  adj{vi),  results  in  another  clique  in  the 
graph  The  edges  of  this  clique  contain  information  about  v,>i  and  v,-,  since 

{adj{vi)  -  {wi+i})  C  adj{vi+i) 

and  paths  between  members  of  adj{vi+i)  through  previously  eliminated  vertices 
include  paths  through  adj{vi).  This  can  be  stated  as  the  following: 

Theorem  6  In  the  elimination  process,  the  vertices  defined  by  adj{yi)  separate  v, 
from  the  rest  of  the  graph.  If  Vj  €  adj{vi)  and  j  >  i,  then  adj{vj)  also  separates  Vi 
from  the  rest  of  the  graph. 

The  structure  appearing  in  this  process  is  the  e/iminotion  clique  and  its  boxmdary 
vertices.  In  SP  graphs,  this  was  a  single  edge,  and  a  labeling  scheme  could  simply 
use  edges  to  keep  track  of  information  concerning  the  constituent  vertices  of  a  path. 
However,  the  arbitrary  sized  elimination  clique  suggests  no  such  simple  labeling 
scheme. 

This  clique  is  sometimes  referred  to  as  «a  elimination  clique. 
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Figure  28:  Reduction  Rules  for  Arbitrary  Graphs 


Figure  29:  A  Decomposition  Tree  Fragment 

5,2  Reduction  Rules  for  Arbitrary  Graphs 

A  simple  set  of  reduction  rules  for  arbitrary  graphs  czm  be  formulated  in  terms  of 
the  Gaussian  elimination  procedure  disctissed.  Each  rule  removes  a  single  vertex 
from  the  graph  and  adds  fill-in  edges  to  keep  all  existing  paths  present.  The  new 
reduction  rules  have  as  their  single  criterion  for  application  the  number  of  neighbors 
of  vertex  w,-.  Figure  28  shows  a  few  of  the  reduction  rules  for  small  numbers  of 
neighbors. 

These  reduction  rules  may  be  built  into  decomposition  trees  that  au‘e  similar 
to  those  presented  for  SP  graphs.  However,  these  trees  are  no  longer  binary  and 
will  have  two  types  of  nodes:  separator  sets  and  reduction  rules.  A  reduction  rule 
contains  the  vertex  it  eliminated,  v,,  and  is  also  associated  with  a  separator  set  node 
which  contains  adj{vi).  We  can  represent  reduction  rules  graphically  as  shown  in 
Figure  29. 

After  application  of  a  reduction  rule,  the  graph  has  one  fewer  vertex,  and  a 
decomposition  tree  fragment  has  been  constructed.  These  fragments  are  joined 
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Figure  30:  Elimination  of  vj  and  the  Resulting  Decomposition  Tree  Fragments 


Figure  31:  The  Resulting  Decomposition  Tree 

into  a  tree  structure  whenever  possible.  Consider  elimination  of  vy  as  shown  in 
I  Figure  30.  There  are  now  two  decomposition  tree  fragments.  Through  application 

of  Theorem  6, it  is  known  that  since  Vy  €  adj{vi),  <idj{vj)  separates  v,-  from  the  graph 
as  well.  This  relationship  can  be  reflected  naturally  in  the  joining  of  tree  fragments 
as  shown  in  Figure  31. 

An  algorithm  for  the  construction  of  a  decomposition  tree  is  presented  in  Fig¬ 
ure  32.  Initially,  there  is  a  graph  G  and  an  empty  set  of  tree  fragments,  F.  Applica¬ 
tion  of  a  reduction  rule  to  a  vertex  v  results  in  a  new  tree  fragment  with  a  separator 
set  node  £,  and  reduction  rule  node  R.  Any  existing  fragments  in  F  whose  root 
(which  is  a  sepzirator  set  node)  contains  v  is  a  candidate  for  joining  as  described 
above.  At  the  termination  of  the  algorithm,  there  is  one  tree  fragment  in  F,  which 
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is  a  decomposition  tree  for  the  entire  graph. 

This  algorithm  poses  an  interesting  problem.  The  order  of  selection  of  the 
vertices  determines  the  sepaurator  sets  found  as  well  as  their  size.  Since  it  is  desired 
to  find  separator  sets  of  minimum  size,  the  problem  is  identical  to  that  of  minimizing 
d(v)  in  Gaussian  elimination.  Once  again,  the  minimum  degree  algorithm  is  a  good 
heuristic  that  minimizes  d{v)  for  all  v  in  the  types  of  graph  with  which  this  report 
is  most  concerned:  GSP  graphs  and  other  graphs  that  may  be  reduced  by  rules  of 
small  order. 


5.3  A  Separator  Theorem  for  a  Bipartite  Tree 

A  theorem  presented  for  an  unusual  tree  will  later  be  useful  for  finding  sepeu-ators 
for  arbitrary  graphs.  This  tree  has  two  types  of  nodes,  red  auid  black.  A  red  node, 
r,  has  internal  weight,  t(r)  =  1,  and  a  black  node,  6,  has  internal  weight,  t(6)  =  0. 
The  total  weight  of  a  node  n  (of  either  type)  is  the  sum  of  the  internal  weights  of 
all  descendants  of  n. 

«'(«)  =  *{p) 

p€d*ie€ti^rUt{n) 

Each  of  these  nodes  may  have  any  number  of  children,  which  must  be  of  the  opposite 
type;  thus,  it  is  a  bipartite  tree.  The  root  of  the  tree  is  a  black  node  and  leaves  of 
the  tree  ue  red  nodes.  Figure  33  shows  a  saimple  red-black  tree  with  appropriate 
weights  labeled.  Red  nodes  are  represented  by  circles,  and  black  nodes  by  squeu'es. 

The  weight  of  the  root  node  w(root)  =  W  indicates  how  many  red  nodes  there 
are  in  the  tree.  If  the  number  of  children  of  all  red  nodes  in  the  tree  is  bounded  by 
some  constant,  k,  then  we  can  show  that  there  is  a  black  node  whose  removal  from 
the  tree  causes  the  tree  to  split  into  parts,  none  of  which  will  have  weight  greater 
than 


Theorem  7  A  Red-Black  tree  of  total  weight  W  m  which  no  red  node  has  more 
than  k  children  has  a  black  node  b  that  separates  the  tree  into  parts,  each  of  which 
has  weight  less  than  or  equal  to 
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{Given  G  =  (V,  E)  } 

type  SeparatorSet  =  record 
set,  sons 
endrecord; 

type  ReductionRule  =  record 
vertex,  order,  sons 
endrecord; 

ReductionRule  R; 

SeparatorSet  S; 

F  *-<d 

for  t  =  1  to  n  do 

E*-  EuDivci)] 
new  R;  S  4—  null; 

R.vertex  *—  v*,.;  R.order  ♦-  d(«a,.);  R-sons  «—  0; 

{5ee  if  a  separator  set  with  these  elements  has  already  been  found) 
for  each  SeparatorSet  t  €  F  do 

if  adj{vai)  ^  S  ^  then  t.8ona  ♦-  t.sons  u{R};S«— tfi 

od 

if  S  =  null  then 

new  S;  S.set  ♦—  o<^(va,);  S.sons  <—  {  R  } 
fi 

{  TVy  to  find  sets  to  add  as  descendants  } 
for  each  SeparatorSet  t  €  F  do 
if  Vg.  €  t.set  then 

R.8ons  4—  R.sons  U  {t}; 

F  4-  F-t; 
fi 

od 

F  4-  Fu  S; 
od 


Figure  32:  An  Algorithm  to  Produce  a  Decomposition  Tree  for  an  Arbitrary  Graph 


r 
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Figure  33:  A  Red-Bladc  Tree 

PROOF:  Black  nodes  are  examined  starting  at  the  root  of  the  tree  and  descending 
through  the  tree.  It  is  obvious  that  some  black  node,  6x,  can  be  found  whose  weight 
meets  the  criterion,  but  it  must  also  be  shown  that  the  total  weight  of 

all  nodes  not  descendants  of  the  current  black  node,  W  —  ti;(6i),  does  not  exceed 
the  criterion. 

INITIALIZATION:  bi  is  the  root  of  the  tree. 

INVARIANT:  W  —  w{bi)  <  +  jL.  This  b  true  for  the  initial  case  since 

W  =  u;(6i). 

General  Step:  Find  the  child  of  6x  with  the  greatest  weight.  Call  this  node  r. 
If  ttf(r)  <  IV  +  then  6i  b  the  node  that  satbfies  the  required  criterion. 

Otherwbe,  find  the  largest  child  of  r  and  call  it  6,.  If  u;(6j)  >  then 

repeat  the  general  step  with  bi  6j.  The  invariant  still  holds. 

Otherwbe,  it  b  claimed  that  63  b  the  black  node  that  separates  the  tree  as 
required.  It  b  known  that  w(63)  meets  the  criterion.  It  remains  to  be  shown  that 
the  weight  of  all  nodes  that  are  not  descendants  of  63,  VK  —  w(b3),  does  not  exceed 
the  criterion.  Since  a  red  node  may  have  at  most  k  children,  it  b  known  that 

«;(6j)  X  k  +  1  >  «/(r)  > 

+  _ i 

k  +  1  k[k  +  l)  k 


Therefore, 


U>(63)  > 
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and, 


Finally, 


□ 

An  actual  implementation  of  the  algorithm  as  described  appears  in  Figure  34. 
This  ailgorithm  simply  implements  the  tests  described  in  the  proof.  It  will  be  iised 
later  in  this  chapter  as  part  of  another  algorithm. 

5.4  Using  a  Decomposition  Tree  to  find  Separa¬ 
tors 

In  the  decomposition  tree  proposed,  each  separator  set  node  is  a  valid  separator  of 
the  graph  for  some  value  of  a.  This  section  will  examine  how  a  good  separator  for 
the  graph  may  be  selected,  and  what  values  of  a  caui  be  expected  depending  on  the 
reduction  rules  applied. 

The  decomposition  tree  proposed  in  this  chapter  is  in  fact  a  red-black  tree. 
Separator  set  nodes  correspond  to  black  nodes,  reduction  rule  nodes  to  red  nodes. 
The  weights  proposed  for  red-black  trees  calculate  the  number  of  reduction  rules 
under  any  given  node  in  the  decomposition  tree,  and  hence,  give  the  number  of 
vertices  of  the  graph  separated  by  a  given  separator  set.  Removal  of  a  black  node 
in  the  tree  corresponds  to  the  removal  of  the  separator  set  from  the  graph.  The 
number  of  nodes  in  each  remaining  part  of  the  graph  is  the  total  weight  of  the 
corresponding  part  of  the  decomposition  tree.  Theorem  7  states  that  this  can  be 
bounded  for  some  value  of  k.  For  a  decomposition  tree,  k  is  the  maximum  number 
of  children  of  any  reduction  rule  in  the  tree.  It  will  be  shown  that  k  is  dependent 
on  the  reduction  rules  applied. 


•"(‘i)  >  r^rW'  -  ’ 


*-»-  1  k+1 


W  -  wibi)  <W  -  -  T^) 

^  ^  -  'ik  -I-  1  k  +  V 

W  -  «;(&,)  <  -^W  -h 

k+1  k+1 
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function  RedBlack(tre«  T,  int  k,  int  W)  returns  tree; 
if  red(T)  then 
max  *—  0; 

for  p  €  ehildren{T)  do 

if  tu(p)  >  u;(max)  then  max  *—  p 
od 

RedBlack(max,  k,  W); 

else 

{Are  at  a  black  node  } 
if  tu(T)  <  W  +1^  then 
return  T; 

else 

max  *—  0; 

for  p  €  ehHdren(T)  do 

if  w(p)  >  w(max)  then  max  *-  p 
od 

If  ^(p)  <  W  then 
return  T; 

else 

RedBlsck (max,  k,  W); 
fi 
fi 
fi 

end  RedBIack; 


r\  nr.  nn.iv\  %r\  vT«n  vi  kt  v!  v  *vt  jv\  w  vr  v'  v»  v  ^  v.  w\w  vv.nrwv^www  wvnrw\m.wwvww\^i.'vvTi 
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Fig\ire  35:  A  Reduction  Rule  of  Order  Two 

The  number  of  children  of  any  reduction  rule  is  directly  related  to  the  order  of  the 
reduction  rule  applied.  (Rq  is  called  order  0,  Ri  order  1,  Ri  order  2,  etc.)  Consider 
a  reduction  rule  of  order  two  as  shown  in  Figure  35.  The  pzu-ent  node  of  this  rule 
must  be  a  separator  set  contaiining  two  vertices.  The  only  possible  separator  sets 
that  could  be  children  of  the  reduction  rule  are  those  that  include  the  vertex,  v, 
and  some  subset  of  the  parent  separator  set,  {u,u;}.  The  four  possible  subsets  are: 
{u},  {u,w},  {v,u;},  and  {u, «,«>}.  For  any  reduction  rule  of  order  k,  there  may  be, 
at  most,  2^  separator  sets  as  children. 

5.4.1  Bounding  the  Number  of  Children 

The  previous  section  stated  that  a  reduction  rule  of  order  k  could  have,  at  most, 
2*  children.  This  is  true  only  in  the  worst  case.  Consider  a  graph  in  which  all 
reduction  rules  axe  of  order  two  (an  SP  graph  for  example).  While  each  reduction 
rule  has  the  potential  for  having  four  children,  each  will  have  at  most  two,  since 
each  of  its  children  must  be  sets  containing  two  vertices.  In  general,  if  a  graph  caui 
be  described  entirely  by  reduction  rules  of  order  k,  then  each  reduction  rule  will 
have,  at  most,  k  children. 

For  decomposition  trees  in  which  there  are  reduction  rules  of  varying  orders, 
the  number  of  children  of  a  reduction  rule  may  be  limited  by  merging  some  of  its 
child  separator  sets  together.  This  may  increase  the  size  of  the  separator  sets,  but 
decreases  the  number  of  children  of  the  reduction  rule. 

As  stated  eairlier,  a  reduction  rule  may  have  as  children  the  separator  sets  whose 
elements  are  the  eliminated  vertex  and  some  subset  of  the  pau'ent  separator  set.  Of 
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the  2*  separator  sets  which  may  be  children  of  the  reduction  rule  of  order  k,  k  of 
them  are  “majdmal”  subsets  of  k  elements  in  the  sense  that  all  other  separator  sets 
of  k  or  fewer  elements  are  subsets  of  one  of  these  sets.  If  there  is  no  reduction  rule 
in  the  tree  of  order  A:  +  1,  then  it  is  known  that  these  k  separator  sets  will  be  the 
only  children  of  the  reduction  rule. 

The  algorithm  of  Figure  36  is  similar  to  the  algorithm  of  Figure  32  but  ensures 
the  merging  of  separator  sets  by  creating  the  k  maximal  subsets  of  a  reduction  rule 
as  its  children.  A  procedure  called  MakeSubsets  is  used  which  constructs  the  set 
of  maximal  separator  sets.  When  an  existing  separator  set  is  found  which  is  a  subset 
of  any  of  these  maximal  subsets,  it  is  simply  merged  into  the  set,  adding  its  children 
to  the  list  of  children  of  the  newly  created  maximal  subset.  In  the  event  the  existing 
separator  set  is  not  a  subset  of  any  of  the  newly  created  maximal  subsets  (in  which 
case  it  must  be  the  one  possibl  subset  of  it  +  1  elements),  it  is  simply  added  as  a 
child  of  the  reduction  rule.  By  applying  this  process,  decomposition  trees  may  be 
created  which  obey  the  following  theorem. 

Theorem  8  In  the  decomposition  tree  for  o  graph  described  entirely  by  reduction 
rules  of  order  k  or  less,  each  reduettou  rule  will  have,  at  most,  k  children. 

This  theorem  and  combined  with  Theorem  7  leads  to  the  following: 

Lemma  1  A  graph  described  entirely  by  reduction  rules  of  order  k  or  less  has  a 
k-separator  with  ot  =  —  +  and  0  =  1. 

5.5  A  Nested  Dissection  Ordering  Algorithm 

The  decomposition  tree  can  be  used  to  produce  an  elimination  ordering  in  much  the 
same  manner  as  the  algorithm  proposed  for  decomposition  trees  of  SP  graphs.  The 
algorithm  of  Figure  37  takes  as  arguments  a  decomposition  tree  T,  and  a  separator 
criterion  k.  Auxiliary  procedures  are  defined  as  follows: 

RedBlack:  Finds  the  separator  set  node  in  the  decomposition  tree  that  meets  the 


jr«  "vK  ^  jr  ^  njt  *ji  ^  "ii  »ji  »ji  *jr’\F  "JTy  .^nP  "if  '"i^  ■’y.."^'.  v.^jxrjyr: 
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{Given  G  =  (V.E)  } 

type  SeparatorSet  =  record 
set,  sons 
endrecord; 

type  ReductionRule  =  record 
vertex,  order,  sons 
endrecord; 

ReductionRule  R;  SeparatorSet  S; 

F  ^  0 

for  I  =  1  to  n  do 
V^V-  v«.; 

E*-  EuD(vaJ; 
new  R;  S  ♦—  null; 

R. vertex  <—  v<,.;  R. order  <—  d(wa.);  R.sons  ♦—  MakeSub8et8{va-,adj{va,))', 

{See  if  a  separator  set  with  these  elements  has  already  been  found) 
for  each  SeparatorSet  t  €  F  do 

If  adj{vai)  C  t.set  and  S  ^  null  then  t.sona  «—  t.sons  u{R};S«— tfi 
od 

if  S  =  null  then 

new  S;  S.net  <—  adj{va;)',  S.sons  <—  {  R  } 
fi 

{  Try  to  find  sets  to  add  as  descendants  } 
for  each  SeparatorSet  t  €  F  do 
if  Va,  €  t.set  then 

{  See  if  there  is  a  superset  present  } 
bool  found  =  FALSE; 
for  each  SeparatorSet  s  €  R.sons  do 
if  t.set  C  s.set  and  not  found  then 

s.sons  <—  s.8ons  U  t.sons;  found  =  TRUE; 
fi 

od 

if  not  found  then  R.sons  <—  R.sons  U  {t};fi 
F4-  F-t; 
fi 

od 

F  ♦-  Fu  S; 
od 


Figure  36:  Modified  Algorithm  to  Produce  Decomposition  Tree  for  an  Arbitrary 
Graph 
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procedure  GeneralizedOrdering(  tree  T,  iut  k); 
if  T  is  a  leaf  reduction  rule  then 
Number(T.  vertex) ; 
return; 
fi 

p  *—  RedBIack(T,  k,  ti;(T)); 
for  each  v  &  T.separator  do 
Number  (v); 
od 

S  *—  p.sona; 

{Remove  this  separator  set  node  from  the  tree} 
(p.parent).sons  =  (p. parent). sons  —  p; 
RelabelParents(p) ; 

GeneralizedOrdering(T) ; 
for  each  p  £  5  do 

GeneralizedOrdering  (p) ; 
od 

end  GeneralizedOrdering; 


Figure  37:  Elimination  Ordering  Algorithm  for  Generalized  Decomposition  Trees 
given  k  value  and  returns  this  node. 

Number:  Receives  a  vertex  as  an  argument  and  assigns  it  am  elimination  number. 
This  function  begins  numbering  at  n  and  number  vertices  in  a  descending  order. 
The  function  “remembers”  those  vertices  that  have  already  been  numbered,  and 
does  not  re-number  them. 

RelabelParents:  The  tree  is  assumed  to  have  weights  initially  calculated  for  each 
node.  Removal  of  a  node  requires  relabeling  of  the  weights  of  only  the  nodes  above 
it  in  the  tree.  This  procedure  follows  parent  links  to  perform  this  renumbering. 
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5.6  A  Bound  on  Complexity 

The  total  number  of  operations  in  a  DAG  produced  by  this  method  is  a  function  of 
the  largest  reduction  rule  applied.  Consider  a  graph  which  is  described  by  reduction 
rules,  all  of  which  are  of  order  k  or  less.  Each  separator  set  will  have  k  or  fewer 
vertices,  as  will  the  subgraphs  defined  by  the  set  of  vertices  not  belonging  to  any 
separator  set.  The  elimination  degree  of  ezu:h  of  the  vertices  belonging  to  these 
subsets  cam  be  no  more  than  d(v)  <  3k  for  the  following  reason.  Eacih  of  the 
vertices  can  be  adjacent  to  two  separator  sets  which  have  no  more  than  k  vertices, 
auid  may  also  be  adjacent  to  vertices  in  its  own  subgraph  of  which  there  are  fewer 
than  k  vertices.  Thus,  each  of  these  vertices  will  have  d{v)  <  3k. 

Similarly,  at  the  time  of  their  elimination,  each  of  the  vertices  in  the  separator 
sets  may  be  adjacent  to  two  separator  sets  (of  k  vertices),  and  to  the  other  ver¬ 
tices  of  its  own  separator  set.  Thus,  each  of  these  vertices  has  d(u)  <  3k.  These 
facts,  coupled  with  Equation  1,  gives  the  total  complexity  of  performing  Gaussian 
elimination  by  this  method  as  no  more  than 

(9A;*  +  3k)n, 

which  is 

0{k^n). 

Thus,  k  may  be  viewed  as  measuring  the  sparsity  of  a  graph.  Small  values  of  k 
occur  for  graphs  which  are  sparse,  while  the  largest  value  k  may  take  on,  n,  occurs 
for  a  complete  graph  for  which  the  complexity  of  performing  Gaussian  elimination 
is  known  to  be  O(n’).  The  expression  above  correctly  captures  this  fact. 

5.7  A  Bound  on  Depth 

The  DAG  generated  by  Gaussian  elimination  has  depth  dependent  on  the  manner  in 
which  vertices  are  eliminated.  The  generalized  nested  dissection  algorithm  presented 
in  this  chapter  can  be  analyzed  for  bounds  on  the  depth  of  DAGs  generated. 


CHAPTER  5.  SEPARATORS  FOR  ARBITRARY  GRAPHS 


62 


Figure  39:  A  Separation  Set  aind  Two  External  Vertices 
the  formula  for  describing  the  depth  of  a  clique. 

+  4  +  [lg(A:  +  1)] 

The  value  of  dftp(k)  has  been  calculated  for  a  few  values  of  k  as  shown  in  the  table 
below.  The  function  grows  very  slowly;  its  asymptotic  growth  is  0(k  Igfc). 


k 

1 

5 

2 

10 

3 

15 

4 

21 

5 

28 

6 

35 

7 

42 

8 

50 

From  the  decomposition  tree  for  a  given  graph,  the  size  of  the  maximum  reduc¬ 
tion  rule  applied,  k,  can  be  found.  This  value  also  bounds  the  best  value  of  a  that 
can  be  achieved  using  the  separators  shown  in  the  decomposition  tree.  Each  of  the 
subgraphs  defined  by  the  separator  will  be  eliminated  to  a  single  vertex  in  a  similar 
manner,  with  a  constant  amount  of  depth  added  to  the  DAG.  The  size  of  the  largest 
subgraph  is  guau-anteed  to  be  roughly  of  the  total  graph. 
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The  following  recurrence  describes  the  manner  in  which  depth  grows  for  a  given 
number  of  vertices  in  a  graph,  n,  aind  the  maximum  sized  sepao’ator,  k.  Dk{n)  is  a 
function  that  calculates  the  upper  bound  on  the  depth  of  DAG  generated  for  a  given 
number  of  vertices  with  a  maximum  reduction  rule  of  order  k. 

C.(n)  = 

The  solution  to  this  recurrence  is  given  by 


I<t<logn^l  n 

which  is 

0{{k  Ig  A:]  logi|j  n) 

Thus,  sublinear  growth  of  the  depth  of  DAG  can  be  guaranteed  with  this  method. 
However,  as  k  gets  large,  the  coefficients  increase  and  the  logarithmic  growth  of 
depth  degrades. 


5.8  SP  and  GSP  Graphs  Revisited 

The  separator  theorem  developed  in  thb  chapter  can  be  seen  to  be  a  generalization 
of  the  separator  theorem  for  series-parallel  graphs.  SP  graphs  can  be  completely 
described  by  rules  of  order  k  or  less  with  Jt  =  2.  Therefore,  for  SP  graphs  the 
generalized  decomposition  tree  finds  2-8eparators  with  a  =  |- 

General  Series-Parallel  graphs  were  similar  to  SP  graph  except  that  they  in¬ 
cluded  acyclic  branches.  This  class  of  graph  is  important  in  that  they  may  be  used 
to  describe  most  MOS  circuits  that  arise.  GSP  graphs  may  be  described  by  reduc¬ 
tion  rules  of  order  1  and  2.  Therefore,  1-  and  2-separators  for  GSP  graphs  with 
a  =  I  may  be  found,  and  the  DAGs  representing  the  solution  of  such  a  network  will 
have  depth  that  grows  as  0(lgn]. 
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5.9  Experimental  Results 

The  generalized  nested  dissection  algorithm  described  in  this  chapter  has  been  im¬ 
plemented  and  tested  for  a  variety  of  graphs  that  occur  in  MOS  networks.  Channel 
graphs  for  some  of  the  networks  are  shown  in  Appendix  B.  The  DAGs  generated 
were  assumed  to  represent  binary  Boolean  operators  and  were  analyzed  for  their 
minimum  depth.  DaGs  may  be  rearranged  to  decrease  their  depth  through  a  very 
simple  procedure.  Often,  the  elimination  process  results  in  a  long  chain  of  V  op¬ 
erators  that  may  be  rebalanced  into  a  shorter  tree.  This  was  accounted  for  in  the 
program  that  analyzed  the  minimum  depth  of  a  DAG. 

For  all  coi';  arisons,  the  results  of  a  minimum  degree  ordering  algorithm  along 
with  the  results  of  the  generalized  nested  dissection  algorithm  have  been  shown. 
The  minimum  degree  algorithm  has  been  observed  to  minimize  the  total  number  of 
operations  for  most  of  the  graphs  presented,  while  the  generalized  nested  dissection 
algorithm  tends  to  minimize  depth.  In  most  cases,  there  is  a  tradeoff  between  these 
two:  a  decrease  in  depth  results  in  higher  total  complexity. 

This  first  group  of  graphs  contains  linear  chains  of  the  number  of  vertices  indi¬ 
cated  in  Table  2.  While  chains  of  the  lengths  presented  do  not  occur  often  in  real 
networks,  they  dramatically  illustrate  the  tradeoff  between  depth  and  complexity. 

D  AGs  resulting  from  the  minimum  degree  algorithm  are  essentially  a  straight  line 
of  operators  in  which  only  one  operation  may  be  performed  at  a  time.  However,  at 
the  increased  expense  of  more  total  operations,  the  DAG  produced  by  the  generalized 
nested  dissection  algorithm  has  significantly  less  depth.  For  the  longest  chain  of  100 
nodes,  the  DAGs  differ  in  depth  by  almost  a  factor  of  10,  indicating  that  a  machine 
capable  of  parallel  evaluation  could  evaluate  the  response  of  a  network  described 
by  this  graph  ten  times  faster  than  a  general  purpose  computer. 

Shift  networks  present  a  difficult  problem  for  Gaussian  elimination.  No  algo¬ 
rithms  that  result  in  a  low  value  of  total  operations  have  been  found.  Table  3  shows 
the  results  obtained  for  three  different  sixteen  bit  shifters  that  rotate  data  one  of 
three  possible  bit  positions.  Usually,  the  generalized  nested  dissection  algorithm 
resulted  in  lower  depth  with  nearly  the  same  total  number  of  operations.  These 
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Chain 

Minimum  Degree 

Generalized  Nested 

Length 

Algorithm 

Dissection 

(n) 

depth 

complexity 

depth 

complexity 

10 

36 

36 

17 

60 

20 

76 

76 

25 

164 

30 

116 

116 

27 

284 

40 

156 

156 

33 

388 

50 

196 

196 

35 

308 

60 

236 

236 

35 

628 

70 

276 

276 

41 

732 

80 

316 

316 

41 

852 

90 

356 

356 

41 

972 

100 

396 

396 

43 

1092 

Table  2:  Results  for  Linear  Chains  of  Vaurying  Length 


Shifter 

Minimum  Degree 

Generalized  Nested 

Algorithm 

Dissection 

depth 

complexity 

depth 

complexity 

8hftl6-014 

112 

1632 

100 

1632 

shftl6-012 

126 

904 

92 

924 

shftl6-013 

130 

1364 

109 

1364 

Table  3:  Results  for  16-bit  shifters 

results  are  encouraging.  While  the  savings  in  depth  were  not  large,  there  was  no 
extra  cost  in  total  operations  for  two  of  the  shifters  and  very  little  for  the  third. 

Logical  shifters  do  not  exhibit  the  same  level  of  complexity  for  solution  that 
rotational  shifters  do.  The  results  are  shown  in  Table  4  and  are  very  interesting 
with  respect  to  the  tradeoff  between  depth  and  complexity.  Two  of  the  networks 
could  be  solved  with  a  significant  savings  in  depth  with  a  corresponding  increase  in 
complexity.  The  other  two  were  solved  with  negligible  savings  in  depth,  but  with 
no  increase  in  total  operations. 

A  number  of  RAM  cells  are  shown  in  Table  5.  Each  of  these  networks  can  be 
described  by  reduction  rules  of  order  1.  Parity  ladders  of  varying  numbers  of  nodes 
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Shifter 

Minimum  Degree 

Generalized  Nested 

Algorithm 

Dissection 

depth 

complexity 

depth 

complexity 

Ishft  16-01 

128 

128 

29 

304 

lahftl6-012 

174 

372 

64 

748 

lahftl6-014 

97 

832 

92 

832 

Uhftl6-018 

77 

616 

74 

616 

Table  4:  Results  for  16-bit  logical  shifters 


RAM 

Minimum  Degree 
Algorithm 
depth  complexity 

Generalized  Nested 
Dissection 
depth  complexity 

ram4 

28 

32 

13 

40 

raml6 

39 

100 

20 

132 

ram32 

41 

164 

21 

196 

ram64 

50 

340 

23 

436 

ram256 

61 

1220 

31 

1476 

Table  5:  Results  for  RAMs  of  Vairying  Sizes 

are  shown  in  Table  6.  Each  of  these  may  be  described  by  reduction  rules  of  order 
3,  and  hence,  have  S-separators.  The  results  of  performing  Gaussian  elimination 
on  both  of  these  types  of  networks  shows  a  logarithmic  growth  of  depth.  These 
results  indicate  that  the  algorithm  does,  in  fact,  perform  well  for  graphs  that  can 
be  described  by  small  reduction  rules. 

A  few  miscellanous  networks  are  shown  in  Table  7.  Seradd.a  and  straid.h  repre¬ 
sent  two  separate  subnetworks  in  a  serial  adder  circuit,  and  par^  is  a  parity  genera¬ 
tor  for  four  bits.  These  results  once  again  illustrate  the  tradeoff  between  depth  and 
complexity,  indicating  that  the  generalized  nested  dissection  algorithm  degrades  to 
a  level  of  performance  comparable  to  that  of  the  minimum  degree  algorithm  for 
graphs  without  good  separators. 
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Ladder 


Minimum  Degree 
Algorithm 

complexity 

37 

62 

92 

152 

147 

242 

202 

332 

257 

422 

312 

512 

Dissection 


ipth 


39 

85 

97 

113 

150 

155 


Table  6:  Results  for  Parity  Ladders  of  Varying  Number  of  Vertices 


Network 


seradd.a 

•eradd.b 

par4 


Minimum  Degree 
Algorithm 

depth 

complexity 

23 

64 

17 

24 

25 

88 

Generalised  Nested 
Dissection 


Table  7:  A  Few  Random  Networks 


Chapter  6 


Discussion 


This  report  has  analyzed  the  expressions  generated  by  Gaussian  elimination  with 
a  metric  not  widely  discussed.  While  the  depth  of  an  expression  has  not  been  an 
issue  in  the  past,  the  introduction  of  highly  parallel  hardware  makes  it  an  importamt 
consideration.  It  was  shown  that  depth  can  be  minimized  by  the  proper  selection 
of  an  elimination  ordering  through  nested  dissection  and  that  the  growth  of  depth 
can  be  bounded  as  a  function  of  characteristics  of  the  graph. 

Two  separate  topics  have  been  examined  in  this  report.  One  is  a  method  for 
solving  systems  of  Boolean  equations,  and  the  other  is  a  method  of  finding  separators 
in  arbitrary  graphs.  It  is  curious  that  the  general  framework  of  Gaussian  elimination 
is  common  to  the  solution  of  both  problems.  The  use  of  sepairators  to  produce  an 
elimination  ordering  for  Gaussian  elimination  results  in  a  strange  double  use  of  the  i 

general  Gaussian  elimination  algorithm:  once  to  find  separators  and  then  again  to 
actually  solve  the  system.  In  fact,  the  actual  code  was  written  as  two  identical 
Gaussian  elimination  shells,  with  operators  replaced  as  necessary. 

The  method  presented  for  computing  an  elimination  ordering  is  especially  suited 
to  those  graphs  that  can  be  completely  decomposed  using  reduction  rules  of  small 
order.  GSP  graphs  were  shown  to  be  able  to  be  eliminated  with  all  vertices  having 
elimination  degree  of  two  or  less,  resulting  in  efficient  solution  of  systems  described 
by  GSP  graphs  with  small  total  depth.  This  is  fortunate,  since  the  original  applica¬ 
tion  was  the  simulation  of  MOS  networks,  most  of  which  can  be  described  by  GSP 
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graphs. 

While  the  main  intent  of  this  work  was  to  find  a  way  of  optimizing  the  DAGs 
describing  MOS  networks  for  depth,  the  method  is  applicable  to  any  graph.  The 
order  of  reduction  rules  applied,  k,  was  shown  to  determine  the  depth  of  the  resulting 
DAG.  Experimental  evidence  abo  showed  that  there  seemed  to  be  a  tradeoff  between 
depth  and  total  complexity.  For  graphs  that  did  not  have  good  separators  by  the 
method,  the  resulting  DAG  was  similar  in  depth  and  complexity  to  that  produced  by 
the  minimum  degree  elimination  algorithm,  which  is  considered  to  be  good  for  many 
classes  of  graphs.  This  is  a  desirable  characteristic,  indicating  that  the  algorithm 
proposed  degrades  gracefully,  rather  th2ui  simply  not  working  at  all. 


6.1  Future  Considerations 

This  work  has  raised  a  number  of  interesting  questions.  The  basic  algorithm  used 
in  the  production  of  a  decomposition  tree  is  the  minimum  degree  algorithm;  other 
methods  should  be  examined.  In  particulau*,  a  nested  dissection  algorithm  should 
be  applied  at  this  stage  to  produce  an  elimination  ordering  for  finding  the  decom¬ 
position  tree.  This  tree  would  then  be  used  to  produce  an  elimination  ordering  to 
solve  the  system. 

Finally,  there  may  be  other  methods  of  explicitly  minimizing  the  depth  of  ex¬ 
pressions.  A  method  based  on  a  greedy  depth  algorithm  was  explored  and  looked 
promising,  but  the  results  were  inconclusive.  While  separators  divide  the  graph, 
they  do  not  guarantee  that  the  subgraphs  induced  have  nearly  the  same  ‘‘span” ,  or 
length  of  longest  path  in  the  graph.  Depth  is  clearly  related  to  how  far  (in  edges) 
information  travels  in  a  graph.  Explicit  minimization  of  this  factor  could  lead  to 
DAGs  of  even  smaller  depth. 


Appendix  A 

Graph  Theoretical  Definitions 


A  short  introduction  to  some  basic  terms  from  graph  theory  will  be  given  here. 
Most  axe  standard  (see  [17]). 

A  graph  G  =  {V,E)  consists  of  a  finite  set  of  n  =  |V|  vertices  and  a  finite  set 
of  m  =  |f?|  edges.  Edges  are  pairs  of  vertices  with  E  C  {(v,ti;)|v,u;  E  V,v  ^  u;}.  If 
(u.u;)  is  an  edge,  vertices  v  and  w  are  adjacent  and  edge  (v.iy)  is  incident  to  v  euid 
ty,  which  are  its  endpoints.  The  number  of  edges  incident  on  a  vertex  is  its  degree. 
The  set  of  vertices  adjacent  to  v  are  denoted  adj{v)  amd  are  sometimes  ako  called 
the  neighbors  of  v. 

If  the  edges  are  unordered  pairs  then  the  graph  is  undirected,  otherwise  it  is  a 
directed  graph.  An  edge  (v,u;)  in  a  directed  graph  has  a  tail  v  and  a  head  w.  A 
directed  graph  is  called  a  digraph.  A  vertex  v  of  a  digraph  has  an  associated  indegree 
and  outdegree.  The  number  of  edges  whose  head  is  v  is  the  vertex’s  indegree  and 
the  number  of  edges  whose  tail  is  v  b  its  outdegree.  A  vertex  whose  indegree  is  0 
b  called  a  source  and  a  vertex  whose  outdegree  is  0  b  called  a  sirik. 

If  the  set  of  edges  b  a  multiset,  that  is,  if  multiple  edges  between  the  same  two 
vertices  are  allowed,  then  the  graph  b  called  a  multigraph.  A  directed  multigraph 
b  called  a  multidigraph. 

Graph  G'  =  {V',E’)  is  a  subgraph  of  G  if  V'  C  V  and  F  C  E.  If  W  C  K,  the 
induced  subgraph  G{W)  =  {W,E{W))  where 


E{W)  =  {(t;,u;)  e  E\v,w  G  W}. 
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A  complete  graph  is  a  graph  in  which  each  pair  of  distinct  vertices  is  joined  by  an 
edge.  A  clique  of  a  graph  G  is  a  subset  5  of  V  such  that  G(5)  is  complete. 

A  path  of  length  k  between  vertices  v  and  u;  is  a  sequence  of  vertices  v  = 
*^0.  ■  •  • » t'*  =  ^  such  that  is  an  edge  for  1  <  i  <  k  and  all  the  ver¬ 

tices  Vi, •••,«*  atre  distinct.  If  v  =  to  the  path  is  a  cycle.  If  every  pair  of  vertices 
in  G  is  joined  by  a  path,  then  G  is  connected.  A  chord  is  an  edge  that  joins  two 
vertices  in  a  cycle  that  are  not  adjacent.  A  chordal  graph  is  one  in  which  every 
cycle  of  at  least  four  vertices  has  a  chord 

If  vertices  of  a  graph  can  be  pairtitioned  into  two  sets  Vi  and  Vj  such  that  every 
edge  has  one  endpoint  in  Vi  and  the  other  in  V2,  the  the  graph  is  a  bipartite  graph. 
Similarly,  if  the  nodes  of  a  tree  can  be  partitioned  into  two  sets  such  that  every 
edge  has  one  endpoint  in  each  of  the  sets,  then  the  tree  is  a  bipartite  tree. 
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Figure  40:  A  Linear  Chain  Graph  of  Ten  Vertices 


Figure  41:  A  Sixteen-Bit  Shifter 


Figure  42:  A  Sixteen-Bit  Logical  Shifter 
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